arXiv:cs/0512063v1 [cs.IT] 15 Dec 2005


IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 3, MARCH 2006 




Complex Random Vectors and ICA Models: 
Identifiability, Uniqueness and Separability

Abstract: In this paper the conditions for identifiability,
separability and uniqueness of linear complex-valued independent
component analysis (ICA) models are established. These
results extend the well-known conditions for solving real-valued
ICA problems to complex-valued models. Relevant properties
of complex random vectors are described in order to extend
the Darmois-Skitovich theorem to complex-valued models. This
theorem is used to construct a proof of a theorem for each of
the above ICA model concepts. Both circular and noncircular
complex random vectors are covered. Examples clarifying the
above concepts are presented.

Index Terms: Blind methods, circularity, complex linear models,
complex Darmois-Skitovich theorem, differential entropy,
independent component analysis (ICA), noncircular complex
random vectors, properness.

I. Introduction 

Independent component analysis (ICA) [1] is a relatively 
new signal processing and data analysis technique. It may be 
used, for example, in blind source separation (BSS) and in
identifying or equalizing instantaneous multiple-input
multiple-output (I-MIMO) models. It has found applications, e.g., in
wireless communications, biomedical signal processing and
data mining (see [2] for references). In the instantaneous
complex-valued ICA problem

x = As, (1) 

the goal is to recover the original source signal vectors s from 
the observation vectors x blindly without explicit knowledge 
of the sources or the linear mixing system A. ICA is based 
on the crucial assumption that the underlying unknown source 
signals are statistically independent. Recent textbooks provide
tutorial material and a partial review of ICA [2], [3].

The theorems for linear combinations of real-valued random
vectors and the theoretical conditions on separation for
real-valued signals are now well known [1], [4], [5]. Even though
algorithms for separation of complex-valued signals have been
developed, for example [1], [6], the conditions under which the
separation is possible have not been established. Recent
papers, e.g., [7]-[10], proposing ICA algorithms for
complex-valued data also ignore this important issue.

In this paper we construct theorems stating the conditions
for identifiability, separability, and uniqueness of
complex-valued linear ICA models. These results extend the theorems
proved for the real-valued instantaneous ICA model [1], [5]
to the complex case. Both circular (proper) and noncircular 
complex random vectors are covered by the theorems. These 
conditions depend not only on the probabilistic structure of the 
sources but also the linear space structure of the mixing. In 
order to prove the theorems, the celebrated Darmois-Skitovich 
theorem [4] needs to be extended to linear combinations 
of complex random variables. A good number of statistical 
properties of circular and noncircular complex vectors have 
to be considered in the process of constructing the proof. 
This is due to the special operator structure that may be 
used for complex random vectors. In addition, the second 
order statistical properties of noncircular complex vectors may 
not be defined using the covariance matrix alone [11]—[13]. 
General complex Gaussian random vectors are an important
class of random vectors that needs to be addressed in detail.
There are relatively few papers where noncircular complex 
random vectors are studied [11]—[16]. Hence, many of the 
key results needed in proving the theorems are included in 
this paper and presented in a unified manner. This also allows 
a direct derivation of some fundamental information-theoretic 
quantities like the entropy of a complex normal random vector. 

The paper is organized as follows. In Section II, relevant
properties that distinguish complex random vectors from real
random vectors are described in detail. In particular, the
correlation structure is used to study complex normal random vectors.
These properties are needed in proving the Darmois-Skitovich
theorem for the complex case. This theorem plays a key role
in establishing the conditions for identifiability, separability
and uniqueness of complex linear ICA models in Section III.
Finally, some concluding remarks are given. Most of the proofs
are presented in appendices.

II. Relevant Properties of Complex Random Vectors

The traditional probability theory is concerned with
real-valued random variables (r.v.s) and random vectors (r.vc.s).
The theory has been generalized to various algebraic
structures. The main studies are in the frameworks of locally compact
spaces and complete separable metric spaces (see, e.g., [17]-[20]
and references therein). However, the most natural
extension from the engineering point of view is the complex Hilbert
space. It seems to have gained relatively little attention. Some
results on complex normal r.vc.s can be found in [21], [22].
The second-order structure of complex r.vc.s has been studied
in [11]-[13], [15], and a general framework for higher-order
statistics can be found in [23]. Some research has been
conducted on complex elliptically symmetric distributions [24]
and on complex stable distributions [25]. An extension of Polya's
theorem to the complex case is presented in [26]. The only
systematic Hilbert space approach known to the authors is [14]. This may be
due to the fact that the additive structure of the complex 
Hilbert space is the same as that of the real Hilbert space. 
However, the multiplicative structure and the operator structure 
are different giving r.vc.s in a complex Hilbert space distinct 
properties. Even though many results from the general abstract 
theory apply directly to the complex Hilbert space case, the 
systematic treatment considering both the additive and the 
multiplicative structure seems to be missing. 

In Section II-B the finite dimensional Hilbert space is
reviewed by constructing an isomorphism into a real-valued
Hilbert space. This isomorphism shows essentially the
difference between the real and complex Hilbert spaces. In
Section II-C some basic properties of r.vc.s in the complex
Hilbert space are stated, and the second-order structure of complex
r.vc.s is studied in Section II-D. Complex normal r.vc.s are
studied in Section II-E and, finally, the complex
Darmois-Skitovich theorem is proved in Section II-F.

A. Notation 

Let us begin with some definitions and notations. We have 
used typewriter font for all random objects, e.g. x, in order to 
distinguish them from deterministic ones, e.g. x. For random 
vectors, e.g. x, we have used the vec symbol in order to 
separate them from scalar random variables. For deterministic 
objects, the bold face lower case letters are used for vectors, 
e.g. z, and the bold face upper case letters are used for 
matrices, e.g. W. 

The modulus of a complex number $z = z_R + j z_I \in \mathbb{C}$ is
denoted $|z| = \sqrt{z^* z} = \sqrt{z_R^2 + z_I^2}$, where the superscript $*$
denotes the complex conjugate, $z^* = z_R - j z_I$, and $j = \sqrt{-1}$
is the imaginary unit. Recall that any nonzero complex number
$z$ can be given in polar form $z = a e^{j\theta}$, where $a > 0$, $\theta \in \mathbb{R}$.
The number $\theta$ is called an argument of the complex number
$z$, and the argument $\theta = \mathrm{Arg}(z)$ such that $-\pi < \theta \le \pi$ is
called the principal argument. The real part of a p-dimensional
complex vector $(z_1\ z_2\ \cdots\ z_p)^T = z \in \mathbb{C}^p$, where $T$ is
the ordinary transpose, is denoted by $z_R$ and the imaginary
part by $z_I$. The Euclidean norm of a vector $z$ is denoted by
$\|z\|$, with $\|z\|^2 = \langle z, z \rangle = z^H z$, where $\langle \cdot, \cdot \rangle$ is the inner product and
the superscript $H$ denotes the conjugate transpose, i.e., the
Hermitian adjoint. A complex matrix $C \in \mathbb{C}^{p \times p}$ is termed
[27] symmetric if $C^T = C$ and Hermitian if $C^H = C$.
Furthermore, the matrix $C$ is orthogonal if $C^T C = C C^T = I_p$
and unitary if $C^H C = C C^H = I_p$, where $I_p$ denotes the
$p \times p$ identity matrix.

B. Complex Hilbert space isomorphism 

Let $C = C_R + j C_I \in \mathbb{C}^{m \times p}$ and $z = z_R + j z_I \in \mathbb{C}^p$. We
use the following notations

$$C_{\mathbb{R}} = \begin{pmatrix} C_R & -C_I \\ C_I & C_R \end{pmatrix} \quad \text{and} \quad z_{\mathbb{R}} = \begin{pmatrix} z_R \\ z_I \end{pmatrix} \tag{2}$$

for the associated $2m \times 2p$ real matrix and 2p-variate real
vector, respectively. The mapping $z \mapsto z_{\mathbb{R}}$ gives naturally a
group isomorphism between the additive Abelian groups $\mathbb{C}^p$
and $\mathbb{R}^{2p}$. In the case m = p = 1, the mapping given by
$C \mapsto C_{\mathbb{R}}$ defines a field isomorphism (e.g., [14], [22]) between
the complex numbers and a subset of real two-dimensional
matrices. Therefore, one can construct real structures where
the role of complex multiplication is played by the special
matrices.

Now consider the mapping

$$Cz \mapsto (Cz)_{\mathbb{R}} = C_{\mathbb{R}} z_{\mathbb{R}}. \tag{3}$$

It is continuous and therefore preserves the topological
properties, i.e., it is a homeomorphism [19]. Let diag(z) (as in
Matlab) denote the diagonal matrix with components of z in
its main diagonal and zeros elsewhere. Since $\mathbb{C}^p$ is a vector
space, where the scalar multiplication for $c \in \mathbb{C}$ is given by

$$cz \triangleq \mathrm{diag}((c\ \cdots\ c))\, z, \tag{4}$$

the mapping (3) defines a vector space isomorphism between
the standard p-dimensional complex vector space and a
2p-dimensional real-valued vector space given by the mapping. It
is important to realize that this associated real-valued vector
space is not isomorphic to the standard real vector space $\mathbb{R}^{2p}$.
Furthermore, by equating $z_1^H$ with $C$ in (3) it is easily verified
that the mapping $\mathbb{C} \to \mathbb{R}^2 : z_1^H z_2 \mapsto (z_1^H)_{\mathbb{R}} (z_2)_{\mathbb{R}}$ associates
a (complex) inner product for $\mathbb{R}^{2p}$. Therefore, the mapping
(3) is also a Hilbert space isomorphism. Again, it should be
emphasized that the inner product given by the mapping is
not the standard Euclidean inner product in $\mathbb{R}^{2p}$. However,
the vector norms, and hence metrics, are equivalent in both.
The following properties are easily established.

Lemma 1: Let $C \in \mathbb{C}^{p \times p}$ and $z \in \mathbb{C}^p$.

(i) $|\det(C)|^2 = \det(C_{\mathbb{R}})$.

(ii) $C$ is Hermitian iff $C_{\mathbb{R}}$ is symmetric. Then $\det(C)^2 =
\det(C_{\mathbb{R}})$ and $2 \times \mathrm{rank}(C) = \mathrm{rank}(C_{\mathbb{R}})$.

(iii) $C$ is nonsingular iff $C_{\mathbb{R}}$ is nonsingular.

(iv) $C$ is unitary iff $C_{\mathbb{R}}$ is orthogonal.

(v) $z^H C z = z_{\mathbb{R}}^T C_{\mathbb{R}} z_{\mathbb{R}}$ for Hermitian $C$ (for a general $C$ the
right-hand side equals $\mathrm{Re}\{z^H C z\}$).

(vi) $C$ is Hermitian positive definite iff $C_{\mathbb{R}}$ is symmetric
positive definite.

(vii) Any polynomial with complex coefficients in variables
$z_{\mathbb{R}}$ can be equivalently given in variables $(z, z^*)$.

Proof: These properties are direct consequences of the
isomorphism, see, e.g., [22], [24]. The last property follows
from the identities $z_R = \frac{1}{2}(z + z^*)$ and $z_I = \frac{1}{2j}(z - z^*)$. ■

Since the variables $(z, z^*)$ in Lemma 1(vii) are dependent,
we call such complex polynomials wide sense polynomials.
The idea of using also the complex conjugate variable has
turned out to be highly useful in, e.g., complex parameter
estimation [28] and blind channel equalization [16].
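
The mapping of Eq. (3) and some of the properties in Lemma 1 can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `real_matrix` and `real_vector` are hypothetical and introduced here only for illustration.

```python
import numpy as np

def real_matrix(C):
    """Real 2m x 2p representation [[C_R, -C_I], [C_I, C_R]] of a complex matrix."""
    C = np.asarray(C, dtype=complex)
    return np.block([[C.real, -C.imag], [C.imag, C.real]])

def real_vector(z):
    """Real 2p-vector (z_R; z_I) associated with a complex p-vector."""
    z = np.asarray(z, dtype=complex)
    return np.concatenate([z.real, z.imag])

rng = np.random.default_rng(0)
p = 3
C = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))
z = rng.standard_normal(p) + 1j * rng.standard_normal(p)

# Eq. (3): the real representation of Cz is the product of the representations.
lhs = real_vector(C @ z)
rhs = real_matrix(C) @ real_vector(z)

# Lemma 1(i): |det(C)|^2 equals the determinant of the real representation.
det_complex_sq = abs(np.linalg.det(C)) ** 2
det_real = np.linalg.det(real_matrix(C))

# Lemma 1(iv): a unitary matrix (here the Q factor of a QR decomposition)
# maps to an orthogonal real matrix.
Q, _ = np.linalg.qr(C)
Q_real = real_matrix(Q)
```

The comparison is exact up to floating-point rounding, since both sides are algebraically identical.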


C. Complex random vectors 

A p-variate complex random vector (r.vc.) x is defined as 
an r.vc. of the form 


$$x = x_R + j x_I \tag{5}$$

where $x_R$ and $x_I$ are p-variate real r.vc.s, i.e., $x_R$ and $x_I$ are
measurable functions from a probability space to $\mathbb{R}^p$. This
is equivalent to $x$ being measurable from the probability
space into $\mathbb{C}^p$ due to the separability of the complex space.
Therefore, the probabilistic structure of the r.vc.s in $\mathbb{C}^p$ and the
probabilistic structure of the r.vc.s in $\mathbb{R}^{2p}$ are the same.
However, the operator structure is different, as is evident from
the previous section. This gives distinct properties to the r.vc.s
with complex values, and justifies studying them separately.
Throughout this paper all complex r.vc.s are assumed to be
full. This means that the support of the induced measure of a
p-dimensional r.vc. is not contained in any lower dimensional
complex subspace.

Since the probabilistic structures of r.vc.s in $\mathbb{C}^p$ and in $\mathbb{R}^{2p}$
are the same, the operator structure of r.vc.s in $\mathbb{C}^p$ can also be
studied by first using the isomorphism (3) and then applying
the concepts associated with the real r.vc.s. However, we define
these associated concepts directly on $\mathbb{C}^p$, since this approach
is notationally more convenient.

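The expectation $E[\cdot]$ of a complex r.vc. $x$ is defined as

$$E_x[x] = E_{x_R}[x_R] + j E_{x_I}[x_I], \tag{6}$$

and the distribution function $F_x$ is given as $F_x(z) = F_{x_{\mathbb{R}}}(z_{\mathbb{R}})$,
where $z = (z_1, \ldots, z_p)^T \in \mathbb{C}^p$ and $F_{x_{\mathbb{R}}}$ denotes the
distribution function of the real-valued r.vc. $x_{\mathbb{R}}$. Then for independent
r.v.s $(s_1, \ldots, s_p)^T = s$, we have

$$F_s(z) = F_{s_{\mathbb{R}}}(z_{\mathbb{R}}) = \prod_{k=1}^{p} F_{(s_k)_{\mathbb{R}}}((z_k)_{\mathbb{R}}) = \prod_{k=1}^{p} F_{s_k}(z_k). \tag{7}$$

The same way we define the probability density function
(if it exists) of a p-dimensional complex r.vc. $x$ as $f_x(z) =
f_{x_{\mathbb{R}}}(z_{\mathbb{R}})$, and the characteristic function (c.f.) [14] as

$$\varphi_x(z) = \varphi_{x_{\mathbb{R}}}(z_{\mathbb{R}}) = E_{x_{\mathbb{R}}}[\exp(j \langle z_{\mathbb{R}}, x_{\mathbb{R}} \rangle)] = E_x[\exp(j\, \mathrm{Re}\{\langle z, x \rangle\})]. \tag{8}$$

It follows directly from Eq. (7) that for independent complex
r.v.s $(s_1, \ldots, s_p)^T = s$,

$$\varphi_s(z) = \prod_{k=1}^{p} \varphi_{s_k}(z_k). \tag{9}$$

Using a standard property of real c.f.s and the properties of
the isomorphism (3), we have a useful relation for the c.f. of
an r.vc. $x$ and the c.f. of the linearly transformed r.vc. $Cx$.
Namely, for any complex matrix $C$, we have

$$\varphi_{Cx}(z) = \varphi_{(Cx)_{\mathbb{R}}}(z_{\mathbb{R}}) = \varphi_{C_{\mathbb{R}} x_{\mathbb{R}}}(z_{\mathbb{R}}) = \varphi_{x_{\mathbb{R}}}((C_{\mathbb{R}})^T z_{\mathbb{R}}) = \varphi_{x_{\mathbb{R}}}((C^H)_{\mathbb{R}} z_{\mathbb{R}}) = \varphi_{x_{\mathbb{R}}}((C^H z)_{\mathbb{R}}) = \varphi_x(C^H z). \tag{10}$$

Finally, a c.f. $\varphi_x(z)$ is called analytic if $\varphi_{x_{\mathbb{R}}}(z_{\mathbb{R}})$ is an analytic
c.f. [29], i.e., the real c.f. $\varphi_{x_{\mathbb{R}}}(z_{\mathbb{R}})$ has a regular extension
defined on $\mathbb{C}^{2p}$ in some neighborhood of the origin.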
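
The relation of Eq. (10) between the c.f. of x and the c.f. of Cx can be illustrated with empirical characteristic functions. The following is a minimal sketch, assuming NumPy and the inner product convention ⟨z, x⟩ = z^H x; the function name `empirical_cf` and the chosen sample distribution are assumptions made only for illustration.

```python
import numpy as np

def empirical_cf(samples, z):
    """Empirical c.f. mean(exp(j Re{z^H x})); samples has shape (N, p)."""
    inner = samples @ np.conj(z)      # entry i equals z^H x_i
    return np.mean(np.exp(1j * inner.real))

rng = np.random.default_rng(1)
N, p = 2000, 2
# Arbitrary non-Gaussian, noncircular samples of a complex random vector x.
x = rng.laplace(size=(N, p)) + 1j * rng.uniform(-1.0, 1.0, size=(N, p))
C = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))
z = np.array([0.3 - 0.2j, -0.1 + 0.4j])

cf_Cx = empirical_cf(x @ C.T, z)             # c.f. of Cx evaluated at z
cf_x = empirical_cf(x, C.conj().T @ z)       # c.f. of x evaluated at C^H z
```

Because z^H (C x) = (C^H z)^H x holds for every sample, the two empirical values agree up to rounding and not merely in the limit of large N.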

D. Second-order statistics of complex random vectors 

An r.vc. $x$ has finite second order or weak second order [14]
statistics if $E_x[|\langle x, z \rangle|^2] < \infty$ for all $z \in \mathbb{C}^p$. This is clearly
equivalent to the existence of finite second order statistics for
both real r.vc.s $x_R$ and $x_I$. All r.vc.s in this section are assumed
to have finite second order statistics. Such r.vc.s are in general
called second-order complex r.vc.s.

The second-order statistics between two real r.vc.s may be
described by the covariance matrix. The complex covariance
matrix $\mathrm{cov}[x_1, x_2]$ of two complex r.vc.s $x_1$ and $x_2$ may be
defined as

$$\mathrm{cov}[x_1, x_2] = E_{x_1, x_2}[(x_1 - E_{x_1}[x_1])(x_2 - E_{x_2}[x_2])^H]. \tag{11}$$

However, considering the real representations of the complex
r.vc.s, it can be seen that the complex covariance matrix does
not give a complete second order description. For that we define
the pseudo-covariance matrix$^1$ $\mathrm{pcov}[x_1, x_2]$ [11] as

$$\mathrm{pcov}[x_1, x_2] = E_{x_1, x_2}[(x_1 - E_{x_1}[x_1])(x_2 - E_{x_2}[x_2])^T] = \mathrm{cov}[x_1, x_2^*]. \tag{12}$$

Two complex r.vc.s $x_1$ and $x_2$ are uncorrelated if the real r.vc.s
$(x_1)_{\mathbb{R}}$ and $(x_2)_{\mathbb{R}}$ are uncorrelated, i.e., $\mathrm{cov}[(x_1)_{\mathbb{R}}, (x_2)_{\mathbb{R}}] =
0_{2p \times 2p}$, where $0_{2p \times 2p}$ denotes the $2p \times 2p$ matrix of zeros.
Then, by using the properties from the previous section, the
following lemma [11] follows directly.

Lemma 2: Complex r.vc.s $x_1$ and $x_2$ are uncorrelated if and
only if $\mathrm{cov}[x_1, x_2] = \mathrm{pcov}[x_1, x_2] = 0_{p \times p}$.

As is the case with real r.vc.s, the internal correlation
structure of a single r.vc. $x$ may be of interest in addition
to the correlation between two r.vc.s. Then we define $\mathrm{cov}[x] =
\mathrm{cov}[x, x]$ and $\mathrm{pcov}[x] = \mathrm{pcov}[x, x]$, and call them the
covariance matrix and the pseudo-covariance matrix of an
r.vc. $x$, respectively. It is easily seen that the covariance
matrix $\mathrm{cov}[x]$ is Hermitian and the pseudo-covariance matrix
is symmetric. Since all r.vc.s are assumed to be full, the
covariance matrix $\mathrm{cov}[x]$ is also positive definite. An r.vc. $x$ is
said to have uncorrelated components if all its marginal r.v.s
$x_k$ and $x_l$, $k \neq l$, are uncorrelated. The following lemma is a
simple consequence of Lemma 2.

Lemma 3: A complex r.vc. $x$ has uncorrelated components
if and only if its covariance matrix and pseudo-covariance
matrix are diagonal.

An r.vc. $x$ is said to be spatially white if $\mathrm{cov}[x] = \sigma^2 I_p$
for some $\sigma^2 > 0$. If $\mathrm{pcov}[x] = 0_{p \times p}$, then the r.vc. is called
second order circular (or circularly symmetric). Some authors
prefer the term proper [11], [14]. Circular r.vc.s have gained
most of the attention in the literature of complex r.vc.s. This is
likely due to the fact that all the second order information of
circular r.vc.s is contained in the covariance matrix, which, on
the other hand, behaves like the covariance matrix for the real
r.vc.s. However, in this paper we need the complete
second-order description, which is derived next. Our approach is, to the
best of our knowledge, novel, and is mainly based on the following theorem.
For alternative characterizations, see [12]-[14].
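
As an illustration of Eqs. (11) and (12), sample estimates of the covariance and pseudo-covariance matrices can be compared for a circular and a noncircular signal. The following is a minimal sketch, assuming NumPy; the function name `cov_pcov` and the two test signals (circular Gaussian noise and rotated binary symbols) are assumptions chosen only for illustration.

```python
import numpy as np

def cov_pcov(x):
    """Sample covariance E[xc xc^H] and pseudo-covariance E[xc xc^T]; x is (N, p)."""
    xc = x - x.mean(axis=0)
    n = x.shape[0]
    cov = xc.T @ xc.conj() / n     # entry (i, k): mean of xc_i * conj(xc_k)
    pcov = xc.T @ xc / n           # entry (i, k): mean of xc_i * xc_k
    return cov, pcov

rng = np.random.default_rng(2)
N = 50_000

# Second order circular: i.i.d. real and imaginary parts with equal variances.
circ = (rng.standard_normal((N, 1)) + 1j * rng.standard_normal((N, 1))) / np.sqrt(2)

# Noncircular: real-valued binary symbols rotated by a fixed phase.
noncirc = np.exp(1j * 0.7) * rng.choice([-1.0, 1.0], size=(N, 1))

cov_c, pcov_c = cov_pcov(circ)
cov_n, pcov_n = cov_pcov(noncirc)
```

For the circular signal the pseudo-covariance estimate is close to zero, while for the rotated real-valued signal its modulus equals the variance, so the covariance alone would miss half of the second-order description.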

Theorem 1: Any full complex p-dimensional r.vc. $x$ with
finite second order statistics can be transformed by using
a nonsingular square matrix $C$ such that the r.vc. $s =
(s_1, \ldots, s_p)^T = Cx$ has the following properties:

(i) $\mathrm{cov}[s] = I_p$

(ii) $\mathrm{pcov}[s] = \mathrm{diag}(\lambda[s])$, where $\lambda[s] = (\lambda_1, \ldots, \lambda_p)^T$
denotes a vector such that $\lambda_1 \ge \cdots \ge \lambda_p \ge 0$.

Footnote 1: The pseudo-covariance matrix is called the relation matrix in [12] and the
complementary covariance matrix in [13].

Proof: It is easily verified that $\mathrm{cov}[Cx] = C\, \mathrm{cov}[x]\, C^H$
and $\mathrm{pcov}[Cx] = C\, \mathrm{pcov}[x]\, C^T$. By Corollary 4.6.12(b) in
[27], if a matrix $A$ is Hermitian and positive definite and a
matrix $B$ is symmetric, then there exists a nonsingular matrix
$C$ such that $C A C^H = I_p$ and $C B C^T$ is a diagonal matrix
with nonnegative diagonal entries. Since the covariance matrix
is Hermitian and positive definite and the pseudo-covariance
matrix is symmetric, the proof is completed by noticing that
the diagonal entries can be ordered by permuting the rows
of $C$. ■

Since $\mathrm{cov}[x] = \mathrm{cov}[x_R] + \mathrm{cov}[x_I]$ and $\mathrm{pcov}[x] =
\mathrm{cov}[x_R] - \mathrm{cov}[x_I] + 2j\, \mathrm{cov}[x_R, x_I]$ for any complex
r.v. $x = x_R + j x_I$, it follows that in Theorem 1
$\mathrm{cov}[\mathrm{Re}\{s_k\}, \mathrm{Im}\{s_k\}] = 0$ and $1 \ge \lambda_k = \mathrm{cov}[\mathrm{Re}\{s_k\}] -
\mathrm{cov}[\mathrm{Im}\{s_k\}] \ge 0$, $k = 1, \ldots, p$. The r.vc.s satisfying the
properties of Theorem 1 have a special structure, and they are
here called strongly uncorrelated. Any strongly uncorrelated
r.vc. is white with $\mathrm{cov}[s] = I_p$, but the converse is not true.
In general, for a given r.vc. $x$, the strongly uncorrelated r.vc. $s$
and the strong-uncorrelating transform $C$ given by Theorem 1
are not unique. However, we have the following.

Theorem 2: For a given r.vc. $x$, the vector $\lambda[s]$ in
Theorem 1 is unique.

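Proof: Suppose there exist two nonsingular
transformations $C_1$ and $C_2$ such that the r.vc.s $s_1 = C_1 x$ and $s_2 = C_2 x$
satisfy the properties in Theorem 1. Let $C_1 = U_1 \Lambda_1 V_1^H$
and $C_2 = U_2 \Lambda_2 V_2^H$ be the singular value decompositions
(SVD) (see [27]) of the transform matrices. Now $I_p =
C_1\, \mathrm{cov}[x]\, C_1^H = C_2\, \mathrm{cov}[x]\, C_2^H$, and therefore $\mathrm{cov}[x] =
V_1 \Lambda_1^{-2} V_1^H = V_2 \Lambda_2^{-2} V_2^H$. Since $\mathrm{cov}[x]$ is positive definite,
it follows that $V_1 \Lambda_1 V_1^H = V_2 \Lambda_2 V_2^H$. Now

$$\begin{aligned}
\mathrm{pcov}[s_1] &= U_1 \Lambda_1 V_1^H\, \mathrm{pcov}[x]\, V_1^* \Lambda_1 U_1^T \\
&= U_1 (V_1^H V_1) \Lambda_1 V_1^H\, \mathrm{pcov}[x]\, V_1^* \Lambda_1 (V_1^T V_1^*) U_1^T \\
&= U_1 V_1^H (V_1 \Lambda_1 V_1^H)\, \mathrm{pcov}[x]\, (V_1^* \Lambda_1 V_1^T) V_1^* U_1^T \\
&= U_1 V_1^H (V_2 \Lambda_2 V_2^H)\, \mathrm{pcov}[x]\, (V_2^* \Lambda_2 V_2^T) V_1^* U_1^T \\
&= U_1 V_1^H V_2 (U_2^H U_2) \Lambda_2 V_2^H\, \mathrm{pcov}[x]\, V_2^* \Lambda_2 (U_2^T U_2^*) V_2^T V_1^* U_1^T \\
&= U_1 V_1^H V_2 U_2^H (U_2 \Lambda_2 V_2^H\, \mathrm{pcov}[x]\, V_2^* \Lambda_2 U_2^T) U_2^* V_2^T V_1^* U_1^T \\
&= U_1 V_1^H V_2 U_2^H\, \mathrm{pcov}[s_2]\, U_2^* V_2^T V_1^* U_1^T,
\end{aligned} \tag{13}$$

and since $U_1 V_1^H V_2 U_2^H$ is unitary, $\mathrm{pcov}[s_1]$ and $\mathrm{pcov}[s_2]$
have the same singular values. Since by assumption
$\mathrm{pcov}[s_1]$ and $\mathrm{pcov}[s_2]$ are diagonal with sorted entries, it
follows that $\mathrm{pcov}[s_1] = \mathrm{pcov}[s_2]$. ■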

Remark 1: The proof of Theorem 1 gives a way to construct
a strong-uncorrelating transform $C$ as follows:

(i) Find the usual whitening transform $D = \mathrm{cov}[x]^{-1/2}$, i.e.,
the inverse of the matrix square root of $\mathrm{cov}[x]$.

(ii) Any symmetric matrix $B$ has a special form of SVD
known as Takagi's factorization (see [27]). The
factorization is given as $B = U \Lambda U^T$, where $U$ is unitary and $\Lambda$
is a diagonal matrix with real nondecreasing nonnegative
main diagonal entries. An example of the factorization
is given later in the paper. Hence, find $\mathrm{pcov}[Dx] = U \Lambda U^T$.

(iii) Set $C = U^H D$.

Notice also that the vector $\lambda[s]$ contains the singular values
of the pseudo-covariance matrix of a white r.vc. with unit
variances.
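
The three steps of Remark 1 can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes NumPy, uses sample estimates in place of the true covariance and pseudo-covariance, and computes Takagi's factorization through the eigendecomposition of an associated real symmetric matrix, which is a technique substituted here for convenience and is only valid when all circularity coefficients are strictly positive. All function names are hypothetical.

```python
import numpy as np

def cov_pcov(x):
    """Sample covariance and pseudo-covariance of samples x with shape (N, p)."""
    xc = x - x.mean(axis=0)
    n = x.shape[0]
    return xc.T @ xc.conj() / n, xc.T @ xc / n

def inv_sqrtm_hermitian(A):
    """Inverse square root of a Hermitian positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V / np.sqrt(w)) @ V.conj().T

def takagi(B):
    """Takagi factorization B = U diag(s) U^T of a complex symmetric matrix.

    Assumption of this sketch: all singular values of B are strictly positive.
    """
    p = B.shape[0]
    B = 0.5 * (B + B.T)                               # remove rounding asymmetry
    M = np.block([[B.real, B.imag], [B.imag, -B.real]])
    w, E = np.linalg.eigh(M)                          # eigenvalues come in pairs +s, -s
    idx = np.argsort(w)[::-1][:p]                     # the p nonnegative ones, descending
    return E[:p, idx] + 1j * E[p:, idx], w[idx]

def strong_uncorrelating_transform(x):
    """Return (C, lam) with cov[Cx] = I and pcov[Cx] = diag(lam) for the samples x."""
    cov, pcov = cov_pcov(x)
    D = inv_sqrtm_hermitian(cov)                      # step (i): whitening
    U, lam = takagi(D @ pcov @ D.T)                   # step (ii): pcov[Dx] = U diag(lam) U^T
    return U.conj().T @ D, lam                        # step (iii): C = U^H D

rng = np.random.default_rng(3)
N = 50_000
u = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(N, 4))     # unit variance real sources
s = np.column_stack([np.sqrt(0.9) * u[:, 0] + 1j * np.sqrt(0.1) * u[:, 1],
                     np.sqrt(0.7) * u[:, 2] + 1j * np.sqrt(0.3) * u[:, 3]])
A = np.array([[1.0 + 0.5j, 0.3], [-0.2j, 0.8 - 0.4j]])    # hypothetical mixing matrix
x = s @ A.T

C, lam = strong_uncorrelating_transform(x)
cov_y, pcov_y = cov_pcov(x @ C.T)
```

In this example the two sources have circularity coefficients 0.8 and 0.4 by construction, so the estimated circularity spectrum `lam` should be close to (0.8, 0.4), and the transformed samples have identity sample covariance and diagonal sample pseudo-covariance.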

The previous theorems lead to a useful characterization of 
second-order complex r.vc.s. 

Definition 1: The vector $\lambda[x] = \lambda[s] = (\lambda_1, \ldots, \lambda_p)^T$ in
Theorem 1 is called the circularity spectrum of an r.vc. $x$. An
element of the circularity spectrum corresponding to an r.v. is
called a circularity coefficient.

Any r.vc. $x$ is clearly second order circular if and only if
its circularity spectrum is a zero vector, i.e., $\lambda[x] = 0_{p \times 1}$.

Corollary 1: If the circularity spectrum of an r.vc. has
distinct elements, all rows corresponding to nonzero circularity
coefficients of the strong-uncorrelating transform are unique
up to multiplication of the row by $-1$. A row corresponding
to the zero coefficient is unique up to multiplication of the
row by $e^{j\theta}$, $\theta \in \mathbb{R}$.

Proof: The left unitary factor in the SVD of a block
matrix with distinct singular values is determined up to right
multiplication by the matrix $\Delta = \mathrm{diag}(e^{j\theta_1}, \ldots, e^{j\theta_p})$ and the
right unitary factor is determined by the left unitary factor [27].
In the special form for a symmetric matrix (Takagi's
factorization), $\theta_k = 0$ or $\theta_k = \pi$ for the values of $k$ corresponding
to nonzero singular values. Therefore, $U_1 V_1^H V_2 U_2^H = \Delta$ in
Eq. (13) and

$$C_1 = U_1 \Lambda_1 V_1^H = U_1 V_1^H (V_1 \Lambda_1 V_1^H) = \Delta U_2 V_2^H (V_2 \Lambda_2 V_2^H) = \Delta U_2 \Lambda_2 V_2^H = \Delta C_2 \tag{14}$$

by the proof of Theorem 2. ■

Some properties of the circularity coefficient are listed in
the following lemma, whose proof is given in an appendix.

Lemma 4: Let $x$ and $y$ be uncorrelated second-order
complex r.v.s. Then

(i) $0 \le \lambda[cx] = \lambda[x] = \frac{|\mathrm{pcov}[x]|}{\mathrm{cov}[x]} \le 1$ for any nonzero
constant $c \in \mathbb{C}$,

(ii) $\lambda[x] = 1$ if and only if $x = c(s_R + j\alpha)$ for some unit
variance real r.v. $s_R$ and deterministic constants $0 \neq c \in
\mathbb{C}$, $\alpha \in \mathbb{R}$,

(iii) $\lambda[x + y] = \frac{|\mathrm{pcov}[x] + \mathrm{pcov}[y]|}{\mathrm{cov}[x] + \mathrm{cov}[y]} \le \max\{\lambda[x], \lambda[y]\}$
with the equality if and only if $\lambda[x] = \lambda[y]$ and
$\mathrm{Arg}(\mathrm{pcov}[x]) = \mathrm{Arg}(\mathrm{pcov}[y])$ if $\lambda[x] \neq 0$.
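
Properties (i) and (iii) of Lemma 4 can be illustrated with sample estimates for scalar r.v.s. The following is a minimal sketch, assuming NumPy; the function name `circularity_coefficient` and the specific Gaussian test variables are assumptions used only for illustration.

```python
import numpy as np

def circularity_coefficient(x):
    """Sample circularity coefficient |pcov[x]| / cov[x] of a scalar complex r.v."""
    xc = x - x.mean()
    return abs(np.mean(xc * xc)) / np.mean(np.abs(xc) ** 2)

rng = np.random.default_rng(4)
N = 50_000

def draw():
    # Unit variance r.v. with pcov = 0.9 - 0.1 = 0.8, i.e. circularity coefficient 0.8.
    return np.sqrt(0.9) * rng.standard_normal(N) + 1j * np.sqrt(0.1) * rng.standard_normal(N)

x = draw()
y = np.exp(1j * np.pi / 4) * draw()   # same coefficient, pseudo-covariance rotated by pi/2

lam_x = circularity_coefficient(x)
lam_y = circularity_coefficient(y)
lam_scaled = circularity_coefficient((2.0 - 3.0j) * x)   # property (i): scale invariance
lam_sum = circularity_coefficient(x + y)                 # property (iii)
```

Since the pseudo-covariances of x and y have different arguments here, the coefficient of the sum is strictly smaller than the common coefficient of the summands (about 0.57 against 0.8 in expectation).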


E. Complex normal random vectors 

There is no commonly agreed definition of what is meant
by complex normal r.vc.s. It is natural to require that an r.vc.
$x$ is normal (Gaussian) if the real r.vc. $x_{\mathbb{R}}$ is multivariate
normal. Such r.vc.s are generally called wide sense normal
r.vc.s [14]. Since a real normal r.vc. is completely
characterized by its mean vector and covariance, the results
from the previous section show that a wide sense complex
normal r.vc. is completely specified by its mean, covariance
matrix, and pseudo-covariance matrix.

However, not all wide sense normal r.vc.s possess all the
properties that real normal r.vc.s do. Only a special subclass
of wide sense normal r.vc.s has a density function similar to
the real r.vc.s [21], [22], maximizes the entropy [11], or has
the 2-stability property (Polya's characterization) [26]. Such
r.vc.s are called narrow sense normal r.vc.s [14]. They are
wide sense normal r.vc.s such that the real and imaginary parts
of any linear projection of the r.vc. are independent and have
equal variances. This condition is equivalent to the requirement
that a wide sense normal r.vc. is second order circular (see,
e.g., [11]).

In order to establish the properties of the complex ICA
model of Eq. (1), neither wide sense normal in its full
generality nor narrow sense normal is adequate, and a more specific
characterization of complex normal r.vc.s is needed. This is
done next. From now on, we will use the term "complex
normal" to mean wide sense complex normal r.vc.

The main result is the following decomposition theorem for 
complex normal random vectors. 

Theorem 3: An r.vc. $n$ is complex normal with circularity
spectrum $\lambda$ if and only if

$$n = C(\eta_R + j\eta_I) + \mu \tag{15}$$

for some nonsingular matrix $C$, a complex constant vector $\mu$,
and multinormal real independent r.vc.s $\eta_R \sim N(0_{p \times 1}, \frac{1}{2} I_p +
\frac{1}{2} \mathrm{diag}(\lambda))$ and $\eta_I \sim N(0_{p \times 1}, \frac{1}{2} I_p - \frac{1}{2} \mathrm{diag}(\lambda))$. Also
$\mathrm{cov}[n] = C C^H$, $\mathrm{pcov}[n] = C\, \mathrm{diag}(\lambda)\, C^T$, and $E_n[n] = \mu$.

Proof: It is obvious that the r.vc. $n$ in Eq. (15) is complex
normal, $\mathrm{cov}[n] = C C^H$, $\mathrm{pcov}[n] = C\, \mathrm{diag}(\lambda)\, C^T$, and
$E_n[n] = \mu$. Thus, it remains to show that any complex normal
r.vc. can be given the form (15).

Let $n$ be a complex normal r.vc. Without loss of generality
assume it is zero mean. By Theorem 1 there exists a
nonsingular matrix $D$ such that $\mathrm{cov}[Dn] = I_p$ and $\mathrm{pcov}[Dn] =
\mathrm{diag}(\lambda)$. Let $\eta_R \sim N(0_{p \times 1}, \frac{1}{2} I_p + \frac{1}{2} \mathrm{diag}(\lambda))$ and $\eta_I \sim
N(0_{p \times 1}, \frac{1}{2} I_p - \frac{1}{2} \mathrm{diag}(\lambda))$ be real independent r.vc.s. Now
$\mathrm{cov}[\eta_R + j\eta_I] = \frac{1}{2} I_p + \frac{1}{2} \mathrm{diag}(\lambda) + \frac{1}{2} I_p - \frac{1}{2} \mathrm{diag}(\lambda) = I_p$
and $\mathrm{pcov}[\eta_R + j\eta_I] = \mathrm{diag}(\lambda)$. Hence $Dn$ and $\eta_R + j\eta_I$ have
the same second order structure. Since a zero mean complex
normal r.vc. is completely characterized by the covariance and
the pseudo-covariance matrices, it follows that $Dn = \eta_R + j\eta_I$ in
distribution, and the claim follows by setting $C = D^{-1}$. ■

A complex normal r.vc. $\eta$ such that $C = I_p$ and $\mu = 0_{p \times 1}$
in the representation (15), i.e., $\eta = \eta_R + j\eta_I$, is called standard
complex normal with the circularity spectrum $\lambda$. Clearly any
centered and strongly uncorrelated complex normal r.vc. is
standard. Also, it is seen that any complex normal r.vc. may
be alternatively specified by the mean, the circularity spectrum,
and the (inverse of) strong-uncorrelating matrix $C$.
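
The decomposition of Eq. (15) doubles as a recipe for drawing samples of a complex normal r.vc. with a prescribed circularity spectrum. The following is a minimal sketch, assuming NumPy; the function name `sample_complex_normal` and the particular values of C, λ and μ are hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_complex_normal(C, lam, mu, size, rng):
    """Draw samples n = C (eta_R + j eta_I) + mu as in Eq. (15); returns shape (size, p)."""
    lam = np.asarray(lam, dtype=float)
    p = lam.size
    # Independent real normal vectors with variances (1 + lam)/2 and (1 - lam)/2.
    eta_R = rng.standard_normal((size, p)) * np.sqrt(0.5 * (1.0 + lam))
    eta_I = rng.standard_normal((size, p)) * np.sqrt(0.5 * (1.0 - lam))
    eta = eta_R + 1j * eta_I                  # standard complex normal samples
    return eta @ np.asarray(C).T + np.asarray(mu)

rng = np.random.default_rng(5)
C = np.array([[1.0, 0.5j], [0.2, 1.0 - 0.3j]])   # hypothetical nonsingular matrix
lam = np.array([0.8, 0.4])                       # hypothetical circularity spectrum
mu = np.array([0.3 + 0.1j, -0.2j])

n = sample_complex_normal(C, lam, mu, 200_000, rng)
nc = n - n.mean(axis=0)
cov_hat = nc.T @ nc.conj() / n.shape[0]
pcov_hat = nc.T @ nc / n.shape[0]
```

The sample covariance and pseudo-covariance should then approximate C C^H and C diag(λ) C^T, respectively, as stated in Theorem 3.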

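The previous decomposition allows the derivation of the
differential entropy of a complex normal r.vc. in a closed form.
The entropy $h(x)$ of an r.vc. $x$ is defined as the entropy [30] of the
real r.vc. $x_{\mathbb{R}}$. The following result has been implicitly derived
in [31] without reference to circularity coefficients.

Corollary 2: The differential entropy $h(n)$ of a zero-mean
complex normal r.vc. $n$ with the circularity coefficients $\lambda_k \neq 1$,
$k = 1, \ldots, p$, is given by

$$h(n) = \log(\det(\pi e\, \mathrm{cov}[n])) + \frac{1}{2} \sum_{k=1}^{p} \log(1 - \lambda_k^2). \tag{16}$$

Proof: Let $n = C\eta$ be the decomposition given by
Theorem 3. Now $\det(2\, \mathrm{cov}[\eta_{\mathbb{R}}]) = \prod_{k=1}^{p} (1 - \lambda_k^2)$, and the
differential entropy of a real-valued normal r.vc. [30] simplifies
as

$$\begin{aligned}
h(n) &= \tfrac{1}{2} \log(\det(2\pi e\, \mathrm{cov}[n_{\mathbb{R}}])) \\
&= \tfrac{1}{2} \log(\det(2\pi e\, \mathrm{cov}[C_{\mathbb{R}} \eta_{\mathbb{R}}])) \\
&= \tfrac{1}{2} \log(\det(2\pi e\, C_{\mathbb{R}}\, \mathrm{cov}[\eta_{\mathbb{R}}]\, C_{\mathbb{R}}^T)) \\
&= \tfrac{1}{2} \log(\det(\pi e\, C_{\mathbb{R}} C_{\mathbb{R}}^T)) + \tfrac{1}{2} \log(\det(2\, \mathrm{cov}[\eta_{\mathbb{R}}])) \\
&= \tfrac{1}{2} \log((\pi e)^{2p} \det((C C^H)_{\mathbb{R}})) + \tfrac{1}{2} \log\Big(\prod_{k=1}^{p} (1 - \lambda_k^2)\Big) \\
&= \tfrac{1}{2} \log((\pi e)^{2p} \det(\mathrm{cov}[n]_{\mathbb{R}})) + \tfrac{1}{2} \sum_{k=1}^{p} \log(1 - \lambda_k^2) \\
&= \tfrac{1}{2} \log((\pi e)^{2p} \det(\mathrm{cov}[n])^2) + \tfrac{1}{2} \sum_{k=1}^{p} \log(1 - \lambda_k^2) \\
&= \log(\det(\pi e\, \mathrm{cov}[n])) + \tfrac{1}{2} \sum_{k=1}^{p} \log(1 - \lambda_k^2)
\end{aligned} \tag{17}$$

by the properties of Lemma 1. ■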

Since the summation term on the right of Eq. (16) is
always nonpositive and the entropy of real r.vc.s with the
given covariance is maximized for Gaussian r.vc.s [30], it
may be seen that the entropy of complex r.vc.s with the given
covariance is maximized for a narrow sense complex normal
r.vc. [11], i.e., for a complex normal r.vc. with zero
pseudo-covariance. Theorem 3 also allows an easy derivation of the
c.f. of a complex normal r.vc. [12], [14].
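
The closed form of Eq. (16) can be compared with the entropy of the associated real normal vector computed directly from its 2p x 2p covariance matrix. The following is a minimal sketch, assuming NumPy and natural logarithms; the function names and the values of C and λ are hypothetical.

```python
import numpy as np

def entropy_complex_normal(cov, lam):
    """Eq. (16): log det(pi e cov) + 0.5 * sum_k log(1 - lam_k^2), in nats."""
    _, logdet = np.linalg.slogdet(np.pi * np.e * cov)
    return logdet + 0.5 * np.sum(np.log1p(-np.asarray(lam) ** 2))

def entropy_real_normal(cov_real):
    """Differential entropy 0.5 * log det(2 pi e Sigma) of a real normal vector."""
    _, logdet = np.linalg.slogdet(2.0 * np.pi * np.e * cov_real)
    return 0.5 * logdet

C = np.array([[1.0, 0.5j], [0.2, 1.0 - 0.3j]])   # hypothetical nonsingular matrix
lam = np.array([0.8, 0.4])                       # hypothetical circularity spectrum

cov = C @ C.conj().T                             # cov[n] = C C^H by Theorem 3
C_real = np.block([[C.real, -C.imag], [C.imag, C.real]])
cov_eta_real = np.diag(np.concatenate([0.5 * (1 + lam), 0.5 * (1 - lam)]))
cov_real = C_real @ cov_eta_real @ C_real.T      # covariance of the real representation

h_complex = entropy_complex_normal(cov, lam)
h_real = entropy_real_normal(cov_real)
h_circular = entropy_complex_normal(cov, np.zeros(2))
```

The two entropies agree, and the value for the circular case with the same covariance is larger, which reflects the entropy-maximizing role of the narrow sense complex normal r.vc.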

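Corollary 3: The c.f. of a complex normal r.vc. $n$ is given
by

$$\begin{aligned}
\varphi_n(z) &= \exp\big(-\tfrac{1}{4} z^H \mathrm{cov}[n] z - \tfrac{1}{4} \mathrm{Re}\{z^H \mathrm{pcov}[n] z^*\} + j\, \mathrm{Re}\{z^H E_n[n]\}\big) \\
&= \exp\big(-\tfrac{1}{4} \mathrm{Re}\{\langle z, \mathrm{cov}[n] z + \mathrm{pcov}[n] z^* \rangle\} + j\, \mathrm{Re}\{\langle z, E_n[n] \rangle\}\big).
\end{aligned} \tag{18}$$

Proof: By Theorem 3, $n = C(\eta_R + j\eta_I) + \mu$. Let $z =
z_R + j z_I \in \mathbb{C}^p$, and $\eta = \eta_R + j\eta_I$. Now

$$\begin{aligned}
\varphi_\eta(z) &= \varphi_{\eta_{\mathbb{R}}}(z_{\mathbb{R}}) = \exp\big(-\tfrac{1}{2} z_{\mathbb{R}}^T \mathrm{cov}[\eta_{\mathbb{R}}] z_{\mathbb{R}}\big) \\
&= \exp\big(-\tfrac{1}{4}(z_R^T (I_p + \mathrm{diag}(\lambda)) z_R + z_I^T (I_p - \mathrm{diag}(\lambda)) z_I)\big) \\
&= \exp\big(-\tfrac{1}{4}(z_R^T z_R + z_I^T z_I + z_R^T \mathrm{diag}(\lambda) z_R - z_I^T \mathrm{diag}(\lambda) z_I)\big) \\
&= \exp\big(-\tfrac{1}{4}(z^H z + \mathrm{Re}\{z^T \mathrm{diag}(\lambda) z\})\big),
\end{aligned} \tag{19}$$

and by Eq. (10)

$$\begin{aligned}
\varphi_n(z) &= \varphi_{C\eta + \mu}(z) = \varphi_{C\eta}(z) \exp(j\, \mathrm{Re}\{\langle z, \mu \rangle\}) \\
&= \varphi_\eta(C^H z) \exp(j\, \mathrm{Re}\{\langle z, \mu \rangle\}) \\
&= \exp\big(-\tfrac{1}{4}(z^H C C^H z + \mathrm{Re}\{z^T C^* \mathrm{diag}(\lambda) C^H z\})\big) \exp(j\, \mathrm{Re}\{z^H \mu\}) \\
&= \exp\big(-\tfrac{1}{4}(z^H C C^H z + \mathrm{Re}\{z^H C\, \mathrm{diag}(\lambda)\, C^T z^*\}) + j\, \mathrm{Re}\{z^H \mu\}\big).
\end{aligned} \tag{20}$$

■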

Corollary 3 shows in particular that the second
characteristic function $\psi_n = \log \varphi_n$ of a complex normal r.vc. $n$ is a second-order
wide sense polynomial in variables $(z, z^*)$. Theorem 3 can
also be used to derive the density function of a complex normal
r.vc. However, unlike the c.f., the density function of a wide
sense normal r.vc. does not appear to have a simple form. See
[12] for expressions for the density function in terms of the
covariance and the pseudo-covariance matrices. The following
example essentially shows that in some cases the distribution
of a standard complex normal r.vc. is invariant to orthogonal
transformations.
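
The closed-form c.f. of Eq. (18) can be compared with an empirical c.f. computed from samples generated through Eq. (15). The following is a minimal sketch, assuming NumPy and the convention ⟨z, x⟩ = z^H x; the function name `cf_complex_normal` and all numerical values are hypothetical.

```python
import numpy as np

def cf_complex_normal(z, cov, pcov, mean):
    """Eq. (18) evaluated at a deterministic complex vector z."""
    zh = z.conj()
    quad = (zh @ cov @ z).real + (zh @ pcov @ zh).real   # z^H cov z + Re{z^H pcov z^*}
    return np.exp(-0.25 * quad + 1j * (zh @ mean).real)

rng = np.random.default_rng(6)
N = 200_000
C = np.array([[1.0, 0.5j], [0.2, 1.0 - 0.3j]])   # hypothetical nonsingular matrix
lam = np.array([0.8, 0.4])                       # hypothetical circularity spectrum
mu = np.array([0.3 + 0.1j, -0.2j])

# Samples n = C (eta_R + j eta_I) + mu with the variances given in Theorem 3.
eta = (rng.standard_normal((N, 2)) * np.sqrt(0.5 * (1 + lam))
       + 1j * rng.standard_normal((N, 2)) * np.sqrt(0.5 * (1 - lam)))
n = eta @ C.T + mu

z = np.array([0.6 - 0.3j, -0.4 + 0.5j])
cf_formula = cf_complex_normal(z, C @ C.conj().T, C @ np.diag(lam) @ C.T, mu)
cf_empirical = np.mean(np.exp(1j * (n @ z.conj()).real))
```

The two values should agree up to the Monte Carlo error of the empirical average, which is of the order of the inverse square root of N.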

Example 1: Let the components of $n$ be uncorrelated
complex normal r.v.s with the same circularity coefficient $\lambda$.
Now for a diagonal matrix $\Lambda$ the r.vc. $\Lambda n$ is standard
complex normal with the circularity spectrum $(\lambda\ \cdots\ \lambda)^T$, and
for any (real-valued) orthonormal matrix $O$, $\mathrm{cov}[O \Lambda n] =
O\, \mathrm{cov}[\Lambda n]\, O^H = O I_p O^T = I_p$ and $\mathrm{pcov}[O \Lambda n] =
O\, \mathrm{pcov}[\Lambda n]\, O^T = O (\lambda I_p) O^T = \lambda I_p$. Therefore, the r.vc.
$O \Lambda n$ is also standard complex normal.

F. Darmois-Skitovich theorem for complex random variables 

One of the main characterization theorems for real r.v.s 
is the well-known Darmois-Skitovich theorem (see [4]). The 
theorem is fundamental for proving the identifiability of real 
ICA models [1], [5]. Here we extend the theorem to complex
r.v.s.

The proof of the complex Darmois-Skitovich theorem and
the proof of a closely related characterization theorem
in Section III are both based on a complex functional
equation, stated as a lemma in the appendices. The functional equation is
an extension of the corresponding equation for real variables
(see, e.g., Lemma 1.5.1 in [4]) to complex variables. Using
the mapping (3), the lemma may be easily seen to be a direct
consequence of the real multivariate theorem [32] (see also
[4], [33]). A direct proof is given in an appendix for the sake
of completeness.

The complex extension of Darmois-Skitovich theorem has 
exactly the same form as the real theorem with the wide 
sense complex normal r.v.s taking the role of real normal r.v.s. 
Hence, this theorem is an example where the analogy [22] 
between theories of narrow sense complex normal r.v.s and 
real normal r.v.s is broken. 

Theorem 4 (Complex Darmois-Skitovich): Let $s_1, \ldots, s_n$
be mutually independent complex r.v.s. If the linear forms (the
r.v.s)

$$x_1 = \sum_{k=1}^{n} \alpha_k s_k \quad \text{and} \quad x_2 = \sum_{k=1}^{n} \beta_k s_k, \tag{21}$$

where $\alpha_k, \beta_k \in \mathbb{C}$, $k = 1, \ldots, n$, are independent, then the r.v.s
$s_k$ for which $\alpha_k \beta_k \neq 0$ are complex normal.

Sketch of the proof: The complete proof is given in an
appendix, and it follows the proof of the real-valued
Darmois-Skitovich theorem (see [4]) with appropriate extensions to
the complex field. The idea is to consider two forms of the
logarithm of the joint c.f. of $x_1$ and $x_2$ following from
independence. This functional equation is only satisfied for
wide sense polynomials, showing that the r.v. $x_1$ is complex
normal. This is only possible if the r.v.s $s_k$ are complex normal.

■

Although narrow sense complex normal r.v.s had to be
admitted to the complex Darmois-Skitovich theorem, it may
still appear in view of Corollary 1 that the complex normal
r.v.s appearing in the theorem cannot be completely arbitrary.
That is, it may appear that some of the circularity coefficients
of the normal r.v.s should be equal. This is true if n = 2. However,
it is not generally true, as is shown in the next example.

Example 2: Let fj\ = (ni,n 2 ,n 3 ) T be standard complex 
normal r.vc. with the circularity spectrum A[ 771 ] = (|, 2 , |) T . 
Then 772 = (3 -5 4 ) Vi is a l so standard complex normal 

r.vc. with the circularity spectrum A[ 772 ] = (2,2) T . Thus 
marginals of $\eta_2$ are independent, and the Darmois-Skitovich
theorem applies. However, the circularity spectrum of $\eta_1$ is
distinct. Notice also that by Example 1 the r.vc. obtained from
$\eta_2$ by multiplying with any orthogonal matrix is also a standard
complex normal r.vc. with the same circularity spectrum.

III. Complex ICA Models 

In this section, we show that complex ICA is actually a 
well-defined concept, and we establish theoretical conditions 
similar to the real-valued case [5]. In Section III-A the main
definitions along with some illustrative examples are given.
Also a crucial characterization theorem giving a connection
between vector coefficients and complex normal r.v.s is proved.
Finally, in Sections III-B, III-C, and III-D the conditions for
separability, identifiability, and uniqueness of complex ICA
models, respectively, are derived.

A. Definitions and problem statement 

A general linear instantaneous complex-valued ICA model 
may be described by the equation 

x = As, (22) 

where $\mathbf{s} = (s_1, \ldots, s_m)^T$ are unknown complex-valued
independent non-degenerate r.v.s, i.e., sources, $A$ is a complex
constant $p \times m$ unknown mixing matrix, $p \geq 2$, and
$\mathbf{x} = (x_1, \ldots, x_p)^T$ are mixtures, i.e., the observed complex r.vc.
(sensor array output). The couple $(A, \mathbf{s})$ is called a representation
of the r.vc. $\mathbf{x}$. If no column in the mixing matrix $A$ is collinear
with another column in the matrix, i.e., all columns are pairwise
linearly independent, the representation is called reduced.
All representations are assumed to be reduced throughout
this paper. Furthermore, a reduced representation for the r.vc.
$\mathbf{x}$ in the model (22) is called proper, if it satisfies all the
assumptions made about the model.

The model of Eq. (22) is defined to be

(i) identifiable, or the mixing matrix is (essentially) unique,
if in every pair of proper representations $(A, \mathbf{s})$ and $(B, \mathbf{r})$ of
$\mathbf{x}$, every column of the complex matrix $A$ is collinear with
a column of the complex matrix $B$ and vice versa,

(ii) unique if the model is identifiable and furthermore the
source r.vc.s $\mathbf{s}$ and $\mathbf{r}$ in different proper representations
have the same distribution for some permutation up to
changes of location and complex scale, and

(iii) separable, if for every complex matrix $W$ such that $W\mathbf{x}$
has $m$ independent components, we have $\Lambda P \mathbf{s} = W\mathbf{x}$
for some diagonal matrix $\Lambda$ with nonzero diagonals and
permutation matrix $P$. Moreover, such a matrix $W$ has
to always exist.
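
The identifiability condition in (i) can be checked numerically for two given mixing matrices. The following is a minimal sketch, assuming NumPy; the helper names `collinear` and `essentially_equal` and the tolerance are illustrative choices and do not appear in the paper.

```python
import numpy as np

def collinear(a, b, tol=1e-10):
    """Return True if the nonzero complex vectors a and b satisfy a = c*b."""
    a = np.asarray(a, dtype=complex)
    b = np.asarray(b, dtype=complex)
    scale = np.linalg.norm(a) * np.linalg.norm(b)
    # The Cauchy-Schwarz inequality is tight exactly for collinear vectors.
    return abs(abs(np.vdot(a, b)) - scale) <= tol * scale

def essentially_equal(A, B, tol=1e-10):
    """Every column of A is collinear with a column of B, and vice versa."""
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    a_in_b = all(any(collinear(a, b, tol) for b in B.T) for a in A.T)
    b_in_a = all(any(collinear(b, a, tol) for a in A.T) for b in B.T)
    return a_in_b and b_in_a
```

For instance, a matrix and any column-permuted, column-rescaled copy of it are essentially equal in this sense, which is the relation between mixing matrices of different proper representations in an identifiable model.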

It is entirely possible for the model (22) to be identifiable
but neither unique nor separable, as shown in the next example.

Example 3: As an example of a model which is identifiable
but is neither separable nor unique, consider independent
non-normal r.v.s $s_k$, $k = 1, \ldots, 4$. Let $\eta_1$, $\eta_2$, and $\eta_3$ be independent
standard normal r.v.s with the same circularity coefficient.
Then also the r.v.s $\eta_1 + \eta_2$ and $\eta_1 - \eta_2$ are independent. Now

$$\begin{pmatrix} s_1 + s_3 + s_4 + \eta_1 + \eta_2 \\ s_2 + s_3 - s_4 + \eta_1 - \eta_2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & -1 \end{pmatrix}
\begin{pmatrix} s_1 \\ s_2 \\ s_3 + \eta_1 \\ s_4 + \eta_2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & -1 \end{pmatrix}
\begin{pmatrix} s_1 + \eta_1 + \eta_2 \\ s_2 + \eta_1 - \eta_2 \\ s_3 \\ s_4 \end{pmatrix}, \tag{23}$$

which shows that the corresponding model can not be unique.
However, it is identifiable. R.v.s of the form $s + n$, where $n$
is a normal r.v. independent of $s$, are said to have a normal
component.
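
The independence of $\eta_1 + \eta_2$ and $\eta_1 - \eta_2$ used in Example 3 can be checked through second-order moments. The sketch below assumes zero-mean, unit-variance, mutually independent complex normal $\eta_k$ with given pseudo-covariances; for jointly complex normal linear forms, independence is equivalent to both the cross-covariance and the cross-pseudo-covariance vanishing. The function name `cross_moments` is illustrative only.

```python
import numpy as np

def cross_moments(alpha, beta, pcov):
    """Cross moments of x1 = sum(alpha_k eta_k) and x2 = sum(beta_k eta_k).

    Assumes independent zero-mean eta_k with E|eta_k|^2 = 1 and
    E[eta_k^2] = pcov[k]. Returns (E[x1 conj(x2)], E[x1 x2]).
    """
    alpha = np.asarray(alpha, dtype=complex)
    beta = np.asarray(beta, dtype=complex)
    pcov = np.asarray(pcov, dtype=complex)
    cross_cov = np.sum(alpha * np.conj(beta))   # uses E|eta_k|^2 = 1
    cross_pcov = np.sum(alpha * beta * pcov)    # uses E[eta_k^2] = pcov[k]
    return cross_cov, cross_pcov

# Equal pseudo-covariances: both cross moments of eta1+eta2, eta1-eta2 vanish.
equal = cross_moments([1, 1], [1, -1], [0.5, 0.5])
# Unequal pseudo-covariances: the cross-pseudo-covariance is nonzero.
unequal = cross_moments([1, 1], [1, -1], [0.5, 0.2])
```

With equal pseudo-covariances both moments are zero, while with unequal ones the cross-pseudo-covariance equals their difference, so the two forms are then dependent.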

It follows from the reduction assumption that the number
of columns, i.e., the number of sources or the model order,
is the same in every proper representation of $\mathbf{x}$ in identifiable
models. If $W$ is a separating matrix, then the linear manifolds of
$\Lambda P$ and $W$ must coincide, and therefore $p \geq \mathrm{rank}(W) =
\mathrm{rank}(\Lambda P) = m$, i.e., there have to be at least as many mixtures
as sources in a separable model. This fact also emphasizes that
identifiability of the model (22) depends also on the linear
operator structure, and since the linear operators defined on
$\mathbb{R}^{2p}$ and $\mathbb{C}^{p}$ are not isomorphic, one can not simply consider a
real-valued model with twice the observation dimension when
studying the complex ICA model (22). This is illustrated in
the following example.

Example 4: By simply considering real-valued models with 
twice the dimension, it may actually seem that the complex 
separation is possible only under very strict conditions. Indeed, 
let $r_k$, $k = 1, \ldots, 4$, be independent real-valued r.v.s, and let
$A_1$, $A_2$, $B_1$, and $B_2$ be $2 \times 2$ nonsingular real matrices.
Define $\mathbf{s}_1 = A_1 (r_1 \; r_2)^T$ and $\mathbf{s}_2 = A_2 (r_3 \; r_4)^T$. Now $\mathbf{s}_1$ and
$\mathbf{s}_2$ are independent, but so are also $\mathbf{y}_1$ and $\mathbf{y}_2$,



for any permutation matrix $P$. However, $\mathbf{y}_1$ and $\mathbf{y}_2$ are
mixtures of $\mathbf{s}_1$ and $\mathbf{s}_2$ for many permutations $P$.

The previous example is easily generalized to the ICA 
models that have multidimensional independent sources, i.e., 
one is looking for independent multidimensional subspaces. 
The example shows that such models can not be identified 
or separated without additional constraints on the internal 
dependency structure of the sources or the allowed mixing 
matrices. 

Since linear operators in complex and real spaces are 
not isomorphic, the classes of separable source r.v.s are not 
the same. That is, some source r.v.s considered in complex 
mixtures can be separated although their real-valued represen¬ 
tations in real mixtures can not. This is shown in the next 
example. 

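Example 5: Let $\eta_1, \ldots, \eta_{2m}$ be independent standard zero
mean unit variance real Gaussian r.v.s. Define

$$\boldsymbol{\eta} = \Big( \tfrac{1}{\sqrt{m+1}} (\sqrt{m}\, \eta_1 + j \eta_{m+1}),\ \tfrac{1}{\sqrt{m}} (\sqrt{m-1}\, \eta_2 + j \eta_{m+2}),\ \ldots,\ \tfrac{1}{\sqrt{2}} (\eta_m + j \eta_{2m}) \Big)^T. \tag{25}$$

Now it is easily seen that $\boldsymbol{\eta}$ is a standard normal r.vc. with the
distinct circularity spectrum $\lambda[\boldsymbol{\eta}] = (\ldots, 0)^T$. If
$\boldsymbol{\eta}_{\mathbb{R}}$ is taken as the source r.vc. in the real-valued ICA model,
i.e., $\mathbf{y} = B \boldsymbol{\eta}_{\mathbb{R}}$ and $B$ is a $2p \times 2m$ real-valued matrix, $p \geq m$,
the model is not separable [5]. However, the complex model
involving $\boldsymbol{\eta}$ itself, i.e., $\mathbf{x} = A \boldsymbol{\eta}$ and $A$ is a $p \times m$ complex-valued
matrix, is separable by Corollary ^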

The following characterization theorem is the basis of the
identifiability and uniqueness theorems. It is an extension of
a real theorem [4, Theorem 10.3.1] to the complex case. The
idea of the proof is similar to the proof of the Darmois-Skitovich
theorem, and the proof given loosely follows that of the real
counterpart with appropriate complex extensions.

Theorem 5: Let $(A, \mathbf{s})$ and $(B, \mathbf{r})$ be two reduced
representations of a $p$-dimensional complex r.vc. $\mathbf{x}$, where $A$ and $B$
are constant complex matrices of dimensions $p \times m$ and $p \times n$,
respectively, and $\mathbf{s} = (s_1, \ldots, s_m)^T$ and $\mathbf{r} = (r_1, \ldots, r_n)^T$
are complex r.vc.s with independent components. Then the
following properties hold.

(i) If the $k$th column of $A$ is not collinear with any column
of $B$, then the r.v. $s_k$ is complex normal.

(ii) If the $k$th column of $A$ is collinear with the $l$th column
of $B$, then the logarithms of the c.f.s of the r.v.s $s_k$ and $r_l$
differ by a wide sense polynomial in a neighborhood of
the origin.

Proof: 

(i) By Lemma 7 (see Appendix III), there exists a $2 \times p$
matrix $C$ such that the $k$th column of $D_1 = CA$ is
not collinear with any other column of $D_1$, or with any
column of $D_2 = CB$. Then $C\mathbf{x} = D_1 \mathbf{s} = D_2 \mathbf{r}$, and
applying Lemma 8 (see Appendix III) it is seen that
the r.v. $s_k$ is complex normal.

(ii) By definitions the $k$th column of $A$, say $\alpha$, is collinear
only with the $l$th column of $B$, say $\beta$. Therefore by
Lemma 7 (see Appendix III), there exists a $2 \times p$ matrix
$C$ such that the $k$th column of $D_1 = CA$ is not collinear
with any other columns of $D_1$, or with any column of
$D_2 = CB$ except possibly the $l$th. Furthermore, since
$C\alpha = C(c\beta) = c(C\beta)$ for some $c \in \mathbb{C}$, it is seen that
$(D_1, \mathbf{s})$ and $(D_2, \mathbf{r})$ are reduced representations of $C\mathbf{x}$
such that Lemma 8(ii) gives the claim.


B. Separability 

ICA is commonly used as a blind source separation
method, where the problem is to extract the original signals
from the observed linear mixture. Therefore, separability of 
the ICA model is an important issue. The separability theorem 
for the complex ICA model below may be surprising, since it 
allows also separation of some complex normal mixtures. 

Theorem 6 (Separability): The model of Eq. (22) is separable
if and only if the complex mixing matrix $A$ is of full
column rank and there are no two complex normal source r.v.s
with the same circularity coefficient.

Proof: Suppose the model is separable. Since $m =
\mathrm{rank}(WA) \leq \mathrm{rank}(A) \leq m$, the mixing matrix $A$ is
of full column rank $m$. If there were two complex normal
source r.v.s with the same circularity coefficient, by Example^ 
in Section HLEl there would exist matrices that produce m 
independent components but which are not diagonal matrices 
for any permutation of the columns. 

To the other direction, suppose the mixing matrix $A$ is of
full column rank and there are no two complex normal source
r.v.s with the same circularity coefficient. Now $A^{\#}$, where the
superscript $\#$ denotes the Moore-Penrose generalized inverse
[27], is a separating matrix. Suppose $W$ is a matrix such
that $W\mathbf{x}$ has $m$ independent components. If $WA$ is not of
the form $\Lambda P$, then there exist at least two columns such
that they both contain at least two nonzero elements. By
Lemma 10 (see Appendix III) there can not exist only one
such column since the sources are nondegenerate. Assume
without loss of generality that the first $l$ columns $\beta_k$, $k =
1, \ldots, l \leq m$, of $WA$ are columns with at least two nonzero
elements, and denote the corresponding matrix of rank $l$ by
$B = (\beta_1 \cdots \beta_l)$. By Theorem 4 the r.v. $s_k$ corresponding
to the column $\beta_k$, $k = 1, \ldots, l$, is complex normal, and
we assume, without loss of generality, that the r.vc. $\boldsymbol{\eta}_1 =
(s_1 \cdots s_l)^T$ is standard complex normal. By Theorem 10
(see Appendix II) all components of $\boldsymbol{\eta}_2 = B\boldsymbol{\eta}_1$ are complex
normal, and by Lemma 9 (see Appendix III) all components
of $\boldsymbol{\eta}_2$ are independent. Choose any $l$ rows of $B$ such that
the corresponding submatrix $\tilde{B}$ is of rank $l$, and $\tilde{B}$ contains
a row with two nonzero elements. Since $\tilde{B}$ is not diagonal
for any permutation by construction, $\boldsymbol{\eta}_1$ is standard, and $\boldsymbol{\eta}_2$
has independent components, it follows from Corollary ^ that 
$\boldsymbol{\eta}_1$ can not have a distinct circularity spectrum, which is a
contradiction. Therefore, $WA$ is of the form $\Lambda P$, and the
model is separable. ■
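
The condition of Theorem 6 is easy to state as a check on a given model description. The following is a minimal sketch, assuming NumPy and assuming that the circularity coefficients of the complex normal sources are known; the function name `is_separable` and its arguments are hypothetical and serve only to restate the theorem.

```python
import numpy as np

def is_separable(A, normal_circ_coeffs, tol=1e-10):
    """Check the two conditions of the separability theorem.

    A: p x m complex mixing matrix.
    normal_circ_coeffs: circularity coefficients of those sources that are
        complex normal (non-normal sources impose no restriction).
    """
    A = np.asarray(A, dtype=complex)
    full_column_rank = np.linalg.matrix_rank(A) == A.shape[1]
    lam = np.sort(np.asarray(normal_circ_coeffs, dtype=float))
    # No two normal sources may share a circularity coefficient.
    distinct = bool(np.all(np.diff(lam) > tol))
    return bool(full_column_rank and distinct)
```

Note that a model with more sources than mixtures fails the rank condition regardless of the source distributions.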


Remark 2: If the source s has finite second order statis¬ 
tics and the circularity spectrum A [s] is distinct, then the 
separation can be achieved by simply performing the strong- 
uncorrelating transform by Corollary 0 In this case, there is 
no additional restrictions on the distribution of the source r.v.s, 
and therefore some normal r.v.s can be also separated. An 
example of such a mixture is seen in Example 0 
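
Remark 2 can be illustrated with a small numerical sketch. The construction below (whitening by $C^{-1/2}$ followed by a Takagi factorization of the whitened pseudo-covariance, obtained from an SVD) is one common way to compute a strong-uncorrelating transform; it is an assumption of this sketch rather than the procedure specified in the paper, and it relies on the circularity coefficients being distinct and nonzero.

```python
import numpy as np

def strong_uncorrelating_transform(C, P):
    """Return (W, lam) with W C W^H = I and W P W^T = diag(lam).

    C: Hermitian positive definite covariance matrix.
    P: complex symmetric pseudo-covariance matrix.
    Assumes the singular values lam are distinct and nonzero.
    """
    w, V = np.linalg.eigh(C)
    D = V @ np.diag(w ** -0.5) @ V.conj().T      # C^{-1/2}
    Pw = D @ P @ D.T                             # pseudo-covariance after whitening
    Pw = (Pw + Pw.T) / 2                         # enforce exact symmetry
    U, lam, Vh = np.linalg.svd(Pw)
    # For a symmetric matrix with distinct singular values, conj(V) = U * phases.
    phases = np.diag(U.conj().T @ Vh.T)
    Q = U * np.sqrt(phases)                      # Takagi factor: Pw = Q diag(lam) Q^T
    return Q.conj().T @ D, lam

# Mixture of two standard sources with circularity coefficients 0.8 and 0.3.
A = np.array([[1 + 1j, 0.5], [0.2 - 0.3j, 2j]])
Lam = np.diag([0.8, 0.3])
C = A @ A.conj().T
P = A @ Lam @ A.T
W, lam = strong_uncorrelating_transform(C, P)
```

In this example the product `W @ A` is diagonal with entries of unit modulus, so the transform computed from second-order statistics alone undoes the mixing up to sign, in line with the remark.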


C. Identifiability

Identifiability considers reconstruction of the mixing matrix. 
This is useful in some problems, where the immediate interest 
may not be in the sources themselves but in how they were 
mixed (e.g., channel matrix in MIMO communications). 

Theorem 7 (Identifiability): The model of Eq. (22) is
identifiable, if

(i) no source r.v. is complex normal, or 

(ii) A is of full column rank and there are no two complex 
normal source r.v.s with the same circularity coefficient. 

Proof: 

(i) Since there are no complex normal r.v.s, by Theorem 5(i),
every column has to be collinear with exactly one column
in another proper representation, i.e., the model is
identifiable.

(ii) Let $(A, \mathbf{s})$ and $(B, \mathbf{r})$ be proper representations of $\mathbf{x}$.
Since the model is separable by Theorem 6 and $A^{\#}$
is a separating matrix, $A^{\#} B = P\Lambda$ for a permutation
matrix $P$ and a diagonal matrix $\Lambda$. By the uniqueness
of the generalized inverse, it follows $AP\Lambda = B$.
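
The relation $A^{\#} B = P\Lambda$ in part (ii) can be observed numerically. The sketch below assumes NumPy and uses `np.linalg.pinv` for the Moore-Penrose inverse; the helper `is_scaled_permutation` and the example matrices are illustrative choices.

```python
import numpy as np

def is_scaled_permutation(M, tol=1e-10):
    """True if the square matrix M has exactly one nonzero entry per row and column."""
    M = np.asarray(M)
    nonzero = np.abs(M) > tol
    return bool(
        M.shape[0] == M.shape[1]
        and np.all(nonzero.sum(axis=0) == 1)
        and np.all(nonzero.sum(axis=1) == 1)
    )

# A full column rank mixing matrix and a second representation B = A P Lambda.
A = np.array([[1, 1j], [0, 2], [1 - 1j, 0]])
P = np.array([[0, 1], [1, 0]])
Lam = np.diag([2j, -0.5])
B = A @ P @ Lam
M = np.linalg.pinv(A) @ B   # equals P @ Lam because pinv(A) @ A = I
```

If instead `B` mixes the columns of `A` non-trivially, for example `B = A @ [[1, 1], [0, 1]]`, the product `pinv(A) @ B` is no longer a scaled permutation.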


There is a striking contrast between the two cases in
Theorem 7. Namely, if there are more sources than mixtures,
not a single normal r.v. is allowed, whereas in the other case
all source r.v.s can be normal. The following example shows 
the reason why we can not allow a single normal r.v. for 
identifiability when there are more sources than sensors. 

Example 6: Consider independent non-normal r.v.s $s_1$, $s_2$,
and standard normal r.v.s $\eta_1$ and $\eta_2$ with the same circularity
coefficient. Now


$$\begin{pmatrix} s_1 + s_2 + 2\eta_1 \\ s_1 + 2\eta_2 \end{pmatrix}$$



and the last column shows that the model is not identifiable. 

It is evident from the previous example and from the 
separation theorem that another identifiability condition could 
be formulated by essentially allowing a single normal r.v. and 
not allowing other source r.v.s to have normal components 
with the same circularity coefficient. However, this condition 
is unnecessarily complicated. Therefore, it is not stated in a 
formal manner. 
D. Uniqueness

Uniqueness considers the case where one is interested not 
only in the mixing matrix but also in the distribution of the 
sources. 

Theorem 8 (Uniqueness): The model of Eq. (22) is unique
if either of the following properties holds.

(i) The model is separable.

(ii) All c.f.s of source r.v.s are analytic (or all c.f.s are
non-vanishing), and none of the c.f.s has an exponential
factor with a wide sense polynomial of degree at least
two, i.e., no source r.v. has the c.f. $\varphi$ such that $\varphi(z) =
\varphi_1(z) \exp(V(z, z^*))$ for a c.f. $\varphi_1(z)$ and for some wide
sense polynomial $V(z, z^*)$ of degree at least two.

Proof: 

(i) Let $(A, \mathbf{s})$ and $(B, \mathbf{r})$ be proper representations of $\mathbf{x}$. By
Theorem 7(ii) the model is identifiable, and therefore
$AP\Lambda = B$ for a permutation matrix $P$ and a diagonal
matrix $\Lambda$. Now $\mathbf{s} = A^{\#}\mathbf{x} = A^{\#}B\mathbf{r} = P\Lambda\mathbf{r}$.

(ii) There can not be any complex normal r.v.s, and therefore
the model is identifiable by Theorem 7(i). Now the
logarithms of the c.f.s of the source variables in two
proper representations differ by a wide sense polynomial
by Theorem 5(ii). However, by the assumption this wide
sense polynomial can be at most of degree 1, i.e., the
source variables have the same distribution up to changes
of location and complex scale.


A nonunique but identifiable mixture was described in
Example 3. By slightly restricting the allowed mixing matrices,
it is possible in the real case to obtain more classes of unique
models [5]. Further work is needed to determine if those
theorems can be extended to the complex case.


IV. Conclusion 

In this paper conditions for separability, identifiability, and
uniqueness of complex-valued linear ICA models are
established. Both circular and noncircular complex random vectors
are covered by the results. So far these conditions have
been known for real random vectors only. The conditions for
identifiability and uniqueness are sufficient, and the separability
results, a proof of complex extension of the Darmois-Skitovich 
Theorem is constructed. Some second-order properties and 
characterizations of linear forms of complex random vectors 
are reviewed and new results found in the process of proving 
the theorem. As a by-product of establishing the conditions, 
a theorem on differential entropy for complex normal random 
vectors is proved and a slightly surprising result about sepa¬ 
rating complex Gaussian sources is found. 


Acknowledgment 

The authors wish to thank the anonymous reviewers for their 
valuable comments and suggestions. 


Appendix I 
Proof of LemmaEI 

Proof of Lemma 0 By Theorem Q there exist nonzero 
constants $a, b \in \mathbb{C}$ such that the r.v.s $s = ax$ and $r = by$ are
strongly uncorrelated.

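(i) Since $\mathrm{cov}[s_R] + \mathrm{cov}[s_I] = 1$, $0 \leq \lambda[x] = \lambda[s] =
\mathrm{cov}[s_R] - \mathrm{cov}[s_I] = 1 - 2\,\mathrm{cov}[s_I] \leq 1$. Also $\frac{a}{c}(cx) =
s$, and thus by uniqueness $\lambda[cx] = \lambda[s] = \lambda[x]$.
Furthermore

$$\lambda[x] = |\mathrm{pcov}[s]| = \frac{|\mathrm{pcov}[s]|}{\mathrm{cov}[s]} = \frac{|\mathrm{pcov}[ax]|}{\mathrm{cov}[ax]}
= \frac{|a^2\, \mathrm{pcov}[x]|}{|a|^2\, \mathrm{cov}[x]} = \frac{|a^2|\, |\mathrm{pcov}[x]|}{|a|^2\, \mathrm{cov}[x]} = \frac{|\mathrm{pcov}[x]|}{\mathrm{cov}[x]}. \tag{27}$$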

(ii) $\lambda[x] = 1 - 2\,\mathrm{cov}[s_I] = 1$ if and only if $\mathrm{cov}[s_I] = 0$.

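(iii) Suppose $\lambda[x] \geq \lambda[y]$. Using the first part of the
lemma for an r.v. $x + y$, uncorrelatedness, and the triangle
inequality, we have

$$\begin{aligned}
\lambda[x + y] &= \frac{|\mathrm{pcov}[x + y]|}{\mathrm{cov}[x + y]} = \frac{|\mathrm{pcov}[x] + \mathrm{pcov}[y]|}{\mathrm{cov}[x] + \mathrm{cov}[y]}
= \frac{|\mathrm{pcov}[\frac{1}{a}s] + \mathrm{pcov}[\frac{1}{b}r]|}{\mathrm{cov}[\frac{1}{a}s] + \mathrm{cov}[\frac{1}{b}r]} \\
&= \frac{\left|\frac{1}{a^2}\mathrm{pcov}[s] + \frac{1}{b^2}\mathrm{pcov}[r]\right|}{\frac{1}{|a|^2} + \frac{1}{|b|^2}}
= \frac{\left|\frac{1}{a^2}\lambda[x] + \frac{1}{b^2}\lambda[y]\right|}{\frac{1}{|a|^2} + \frac{1}{|b|^2}} \\
&\leq \frac{\frac{1}{|a|^2}\lambda[x] + \frac{1}{|b|^2}\lambda[y]}{\frac{1}{|a|^2} + \frac{1}{|b|^2}} \leq \lambda[x],
\end{aligned} \tag{28}$$

which proves the inequality.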

If both r.v.s x and y are second order circular, then clearly 
the equality holds in (ED- Now suppose the condition for 
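the equality holds in the noncircular case, and let $\lambda =
\lambda[x] = \lambda[y]$ and $\theta = \mathrm{Arg}(\mathrm{pcov}[x]) = \mathrm{Arg}(\mathrm{pcov}[y])$.
Then

$$\lambda[x + y] = \frac{|\mathrm{pcov}[x] + \mathrm{pcov}[y]|}{\mathrm{cov}[x] + \mathrm{cov}[y]}
= \frac{|\lambda\, \mathrm{cov}[x]\, e^{j\theta} + \lambda\, \mathrm{cov}[y]\, e^{j\theta}|}{\mathrm{cov}[x] + \mathrm{cov}[y]}
= \frac{|\lambda e^{j\theta}|\, |\mathrm{cov}[x] + \mathrm{cov}[y]|}{\mathrm{cov}[x] + \mathrm{cov}[y]} = \lambda. \tag{29}$$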
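To the other direction, the last inequality in (28) holds
with the equality iff $\lambda[x] = \lambda[y]$. If now $\lambda[x] \neq 0$, then
the triangle inequality in (28) holds with the equality iff

$$\frac{|b|^2}{|a|^2} = \frac{b^2\, \mathrm{pcov}[s]}{a^2\, \mathrm{pcov}[r]} = \frac{\mathrm{pcov}[x]}{\mathrm{pcov}[y]}. \tag{30}$$

Hence $\mathrm{Arg}(\mathrm{pcov}[x]) = \mathrm{Arg}(\mathrm{pcov}[y])$ by the polar
forms of $\mathrm{pcov}[x]$ and $\mathrm{pcov}[y]$.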
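
The properties proved above can be checked on concrete second-order moments. The following is a minimal sketch, assuming NumPy and working directly with given covariance and pseudo-covariance values rather than with samples; the function names are illustrative, and `sum_coefficient` assumes that the cross-covariance and cross-pseudo-covariance of $x$ and $y$ vanish.

```python
import numpy as np

def circularity_coefficient(cov, pcov):
    """|pcov[x]| / cov[x] for a scalar complex r.v. with cov[x] > 0."""
    return abs(pcov) / cov

def sum_coefficient(cov_x, pcov_x, cov_y, pcov_y):
    """Circularity coefficient of x + y when all cross moments of x and y vanish."""
    return abs(pcov_x + pcov_y) / (cov_x + cov_y)

# Scale invariance: cov[cx] = |c|^2 cov[x] and pcov[cx] = c^2 pcov[x].
c = 2 - 3j
lam_x = circularity_coefficient(2.0, 1.2j)
lam_cx = circularity_coefficient(abs(c) ** 2 * 2.0, c ** 2 * 1.2j)

# The coefficient of the sum is bounded by the larger individual coefficient.
lam_sum = sum_coefficient(2.0, 1.2j, 1.0, 0.3)

# Equality case: equal coefficients and equal pseudo-covariance arguments.
phase = np.exp(0.7j)
lam_eq = sum_coefficient(2.0, 1.0 * phase, 1.0, 0.5 * phase)
```

Here `lam_x` and `lam_cx` coincide, `lam_sum` lies below the larger of the two individual coefficients, and `lam_eq` equals the common coefficient of the two summands.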
Appendix II

Proof of the complex Darmois-Skitovich theorem and related theorems

The following theorem is a direct consequence of the 
multivariate version of the real Marcinkiewicz theorem. The 
theorem shows essentially that a complex normal r.v. is the 
only r.v. whose second c.f. is a wide sense polynomial. 

Theorem 9 (Complex Marcinkiewicz): If in some neighborhood
of zero the c.f. of a complex r.v. $x$ admits the
representation

$$\varphi_x(z) = \exp(V(z, z^*)), \tag{31}$$

where $V$ is a wide sense polynomial, then the r.v. $x$ is complex
normal.

Proof: Fix $z_0 \in \mathbb{C}$, and define a c.f. $\varphi_0(t) = \varphi_x(t z_0) =
\exp(V(t z_0, t z_0^*))$ for $t \in \mathbb{R}$. Then for some $\epsilon > 0$, $\log \varphi_0(t)$
is a polynomial in $t$, $|t| < \epsilon$. Therefore, by a version of the
$\alpha$-decomposition theorem (see [34, Theorem 7.4.2]) the relation
is valid for all $t$ and $\varphi_0(t)$ is normal. Since $z_0$ is assumed to
be arbitrary, it follows that the equation (31) is valid for all
z. By the last property of Lemma^ V(z, z*) is a polynomial 
in $z_{\mathbb{R}}$, and the claim follows from the multivariate (bivariate)
Marcinkiewicz's theorem (e.g., [29, Theorem 3.4.3]). ■

Also the well-known Cramér's theorem has a direct complex
counterpart.

Theorem 10 (Complex Cramér): If $s_1$ and $s_2$ are independent
r.v.s such that $s_1 + s_2$ is a complex normal r.v., then each
of the r.v.s $s_1$ and $s_2$ is complex normal.

Proof: This is a direct corollary to the real multivariate
Cramér's theorem (e.g., [34, Theorem 6.3.2]). ■

Lemma 5: Consider the equation, assumed valid for
$|z_1|, |z_2| < \epsilon$,

$$\sum_{k=1}^{p} \psi_k(z_1 + c_k z_2) = h_1(z_1) + h_2(z_2), \tag{32}$$

where $\psi_k$, $k = 1, \ldots, p$, $h_1$, and $h_2$ are continuous complex-valued
functions of complex variables and the nonzero complex
numbers $c_k$, $k = 1, \ldots, p$, are distinct. Then all the
functions in (32) are wide sense polynomials in $(z, z^*)$ of
degree not exceeding $p$.

Proof: Let $d_k^{(1)} = (1 - \frac{c_k}{c_p}) b_1$. Now, for small enough $b_1$,
we have

$$\sum_{k=1}^{p} \psi_k\big(z_1 + b_1 + c_k (z_2 - \tfrac{b_1}{c_p})\big) = \sum_{k=1}^{p} \psi_k(z_1 + d_k^{(1)} + c_k z_2)
= h_1(z_1 + b_1) + h_2(z_2 - \tfrac{b_1}{c_p}) \tag{33}$$

by substituting $(z_1 + b_1)$ for $z_1$ and $(z_2 - \frac{b_1}{c_p})$ for $z_2$ in (32).
Subtracting (32) from (33), we obtain

$$\sum_{k=1}^{p-1} \Delta_{d_k^{(1)}} [\psi_k(z_1 + c_k z_2)] = \Delta_{b_1} [h_1(z_1)] + \Delta_{-b_1/c_p} [h_2(z_2)], \tag{34}$$

where $\Delta[\cdot]$ is the general difference operator defined by

$$\Delta_{a} [f(z)] = f(z + a) - f(z)
\quad \text{and} \quad
\Delta^{n+1}_{a_0, \ldots, a_n} [f(z)] = \Delta^{n}_{a_0, \ldots, a_{n-1}} [f(z + a_n) - f(z)] \tag{35}$$

for any constants $a_k \in \mathbb{C}$. Equation (34) is of the same form
as (32) except the number of the terms in the sum is lower. Let
$d_k^{(2)} = (1 - \frac{c_k}{c_{p-1}}) b_2$. Again by substituting and subtracting, we
obtain from (34) the equation

$$\sum_{k=1}^{p-2} \Delta^{2}_{d_k^{(1)}, d_k^{(2)}} [\psi_k(z_1 + c_k z_2)] = \Delta^{2}_{b_1, b_2} [h_1(z_1)] + \Delta^{2}_{-b_1/c_p,\, -b_2/c_{p-1}} [h_2(z_2)]. \tag{36}$$

Continuing the process, we end up with the equation

$$\Delta^{p-1}_{d_1^{(1)}, \ldots, d_1^{(p-1)}} [\psi_1(z_1 + c_1 z_2)]
= \Delta^{p-1}_{b_1, \ldots, b_{p-1}} [h_1(z_1)] + \Delta^{p-1}_{-b_1/c_p, \ldots, -b_{p-1}/c_2} [h_2(z_2)]. \tag{37}$$

This is the generalized Cauchy's equation for complex variables
[35] showing that $\Delta^{p-1}_{d_1^{(1)}, \ldots, d_1^{(p-1)}} [\psi_1(z)] = az + bz^*$ for
some constants $a, b \in \mathbb{C}$. Since the coefficients $b_k$ are arbitrary
in the neighborhood of zero, and by continuity, the difference
operator structure [36] shows that $\psi_1(z)$ is a wide sense polynomial
in $(z, z^*)$ of degree not exceeding $p$. By renumbering,
the same is obtained for $\psi_k(z)$, $k = 1, \ldots, p$, and thus also
for $h_1(z)$ and $h_2(z)$. ■
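
The difference operator of (35) can be sketched directly as a higher-order function. The example below illustrates only the easy direction of the difference operator structure, namely that a wide sense polynomial of degree $n$ in $(z, z^*)$ is annihilated by $n + 1$ successive differences; the function names and the test polynomial are assumptions made for illustration.

```python
def difference(f, a):
    """Return the function z -> f(z + a) - f(z)."""
    return lambda z: f(z + a) - f(z)

def iterated_difference(f, steps):
    """Apply the difference operator once for every step in `steps`."""
    for a in steps:
        f = difference(f, a)
    return f

def poly(z):
    """A wide sense polynomial of degree two in (z, z*)."""
    return z ** 2 + 3 * z * z.conjugate() - z.conjugate()

a1, a2, a3 = 0.1 + 0.2j, -0.3 + 0.1j, 0.25 - 0.4j
second = iterated_difference(poly, [a1, a2])      # constant in z
third = iterated_difference(poly, [a1, a2, a3])   # identically zero
```

For this polynomial the second difference is the constant $2 a_1 a_2 + 3 (a_1 a_2^* + a_1^* a_2)$, and the third difference vanishes up to rounding for every $z$.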

Proof of Theorem 4: The joint c.f. of $(x_1, x_2)^T$ is given
as

$$\begin{aligned}
\varphi_{x_1, x_2}(z_1, z_2)
&= E_{x_1, x_2}[\exp(j\, \mathrm{Re}\{\langle (z_1, z_2)^T, (x_1, x_2)^T \rangle\})] \\
&= E_{x_1, x_2}[\exp(j\, \mathrm{Re}\{\langle (z_1, z_2)^T, \textstyle\sum_{k=1}^{n} (\alpha_k s_k, \beta_k s_k)^T \rangle\})] \\
&= E_{x_1, x_2}[\exp(j \textstyle\sum_{k=1}^{n} \mathrm{Re}\{(\alpha_k z_1 + \beta_k z_2) s_k\})] \\
&= \prod_{k=1}^{n} E_{s_k}[\exp(j\, \mathrm{Re}\{(\alpha_k z_1 + \beta_k z_2) s_k\})] \\
&= \prod_{k=1}^{n} \varphi_{s_k}(\alpha_k z_1 + \beta_k z_2),
\end{aligned} \tag{38}$$

$z_1, z_2 \in \mathbb{C}$, by independence of the r.v.s $s_k$, $k = 1, \ldots, n$. On the
other hand, by independence of $x_1$ and $x_2$, we have

$$\varphi_{x_1, x_2}(z_1, z_2) = \varphi_{x_1}(z_1)\, \varphi_{x_2}(z_2)
= \prod_{k=1}^{n} \varphi_{s_k}(\alpha_k z_1) \prod_{k=1}^{n} \varphi_{s_k}(\beta_k z_2). \tag{39}$$

Thus by combining equations (38) and (39), we get

$$\prod_{k=1}^{n} \varphi_{s_k}(\alpha_k z_1 + \beta_k z_2) = \prod_{k=1}^{n} \varphi_{s_k}(\alpha_k z_1) \prod_{k=1}^{n} \varphi_{s_k}(\beta_k z_2). \tag{40}$$

As always, there exists a neighborhood of zero such that all
c.f.s in Eq. (40) are nonzero. Let $r_k = \alpha_k^* s_k$ and $c_k = \beta_k / \alpha_k$
for $\alpha_k \neq 0$, and $c_k = \beta_k$ for $\alpha_k = 0$. Then, by Eq. (11), we
can rewrite Eq. (40) for some positive $\epsilon > |z_1|, |z_2|$ by setting
$\psi_k = \log \varphi_{r_k}$ as

$$\sum_{k=1}^{l} \psi_k(z_1 + c_k z_2) = \sum_{k=1}^{l} \psi_k(z_1) + \sum_{k=1}^{l} \psi_k(c_k z_2), \tag{41}$$

where it is assumed without loss of generality that the first $l$ r.v.s
$r_k$, $k = 1, \ldots, l$, are such that $\alpha_k \beta_k \neq 0$, and therefore the
components $\psi_k$, $k > l$, cancel out. By combining the functions
$\psi_k$ with equal arguments into a single function $\tilde{\psi}_k$ and
renumbering, Eq. (41) may be rewritten as

$$\sum_{k=1}^{q} \tilde{\psi}_k(z_1 + c_k z_2) = \sum_{k=1}^{l} \psi_k(z_1) + \sum_{k=1}^{q} \tilde{\psi}_k(c_k z_2) \tag{42}$$

such that the numbers $c_k$, $k = 1, \ldots, q \leq l$, are distinct. Therefore,
$\tilde{\psi}_k(z_1)$ is a wide sense polynomial by Lemma 5. By
Theorem 9 the r.v. $\tilde{r}_k$ is complex normal. Thus by
Theorem 10 each r.v. $r_k$, and hence each r.v. $s_k$, $k = 1, \ldots, l$,
is complex normal. ■


Appendix III 

Additional characterization lemmas 


Lemma 6: Let $\alpha_1, \ldots, \alpha_m$ be given nonzero vectors of an
inner product space. Then there exists a vector $\beta$, which is not
orthogonal to any of the given vectors.

Proof: Suppose $\beta$ is not orthogonal to any $\alpha_l$, $l =
1, \ldots, k - 1$, but is orthogonal to $\alpha_k$. Then a nonzero scalar $c \in \mathbb{C}$ can
be chosen such that $\langle \beta, \alpha_l \rangle \neq -c \langle \alpha_k, \alpha_l \rangle$ for all $l < k$. Now
the vector $\beta' = \beta + c\alpha_k$ is not orthogonal to any $\alpha_l$, $l \leq k$.

Since $\alpha_1$ is nonzero, $\beta_1 = \alpha_1$ is not orthogonal to $\alpha_1$.
Choose $\beta_2 = \beta_1 + c_2 \alpha_2$, where $c_2$ is a scalar as above if $\beta_1$
is orthogonal to $\alpha_2$, and $c_2 = 0$ otherwise. By iterating the
procedure $m - 1$ times, it is seen that $\beta_m$ is a required type
of vector. ■
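
The proof of Lemma 6 is constructive and can be sketched as a short routine. The version below assumes NumPy, uses the inner product $\langle \beta, \alpha \rangle = \alpha^H \beta$ (computed with `np.vdot`), and picks the scalar $c$ by halving a trial value until none of the finitely many forbidden values is hit; the function name and the tolerance are illustrative choices.

```python
import numpy as np

def non_orthogonal_vector(alphas, tol=1e-12):
    """Return a vector that is not orthogonal to any of the nonzero vectors in alphas."""
    alphas = [np.asarray(a, dtype=complex) for a in alphas]
    beta = alphas[0].copy()                      # not orthogonal to alphas[0]
    for k in range(1, len(alphas)):
        if abs(np.vdot(alphas[k], beta)) > tol:
            continue                             # corresponds to c_k = 0
        c = 1.0
        # At most k values of c are forbidden, so halving terminates quickly.
        while any(abs(np.vdot(alphas[l], beta + c * alphas[k])) <= tol
                  for l in range(k)):
            c /= 2.0
        beta = beta + c * alphas[k]
    return beta

vectors = [np.array([1, 0, 0]), np.array([0, 1, 0]), np.array([1, -1, 0])]
beta = non_orthogonal_vector(vectors)
```

For the three vectors above, the first trial value of $c$ is rejected in the last step because it would make the result orthogonal to the second vector, which mirrors the role of the condition on $c$ in the proof.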

Lemma 7: Let $\alpha_1, \ldots, \alpha_m$ be given $p$-dimensional nonzero
complex vectors such that $\alpha_1$ is not collinear with any $\alpha_k$,
$k \neq 1$. Then there exists a $2 \times p$ matrix $C$ such that $C\alpha_1$ is
not collinear with any $C\alpha_k$, $k \neq 1$.

Proof: Denote $\alpha_k = (a_{k1}, \ldots, a_{kp})^T$, $k = 1, \ldots, m$.
Without loss of generality we assume that the coefficients $a_{k1}$,
$k = 1, \ldots, m$, are either zero or one. Furthermore, we may
take $a_{11} = 1$ by permuting the original indices.

Suppose $\alpha_1$ is not collinear with $\alpha_k$, i.e., $\alpha_1 \neq \alpha_k$, for
any $k \neq 1$. Define

$$C = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \beta_1 & \beta_2 & \cdots & \beta_p \end{pmatrix}, \tag{43}$$

where $\beta = (\beta_1, \ldots, \beta_p)^T$ is a vector such that

$$\langle \beta, (\alpha_1 - \alpha_k)^* \rangle \neq 0, \quad k = 2, \ldots, m. \tag{44}$$

By Lemma 6 such a vector $\beta$ exists. Now the vectors $C\alpha_k$ are
again such that the first component is either zero or one. Thus
$C\alpha_1$ can be collinear with another vector $C\alpha_k$ only if $a_{k1} =
1$. But then the difference of the second components,

$$(C\alpha_1)_2 - (C\alpha_k)_2 = (\beta^T \alpha_1) - (\beta^T \alpha_k) = \langle \beta, (\alpha_1 - \alpha_k)^* \rangle, \tag{45}$$

is not zero by construction. Thus $C\alpha_1$ is not collinear with
any $C\alpha_k$, $k \neq 1$, and $C$ is a required type of matrix. ■

Lemma 8: Let $(A, \mathbf{s})$ and $(B, \mathbf{r})$ be two reduced
representations of a 2-dimensional complex r.vc. $\mathbf{x}$, where $A$ and $B$
are constant complex matrices of dimensions $2 \times m$ and $2 \times n$,
respectively, and $\mathbf{s} = (s_1, \ldots, s_m)^T$ and $\mathbf{r} = (r_1, \ldots, r_n)^T$
are complex r.vc.s with independent components. Then the
following properties hold.

(i) If the $k$th column of $A$ is not collinear with any column
of $B$, then the r.v. $s_k$ is complex normal.

(ii) If the $k$th column of $A$ is collinear with the $l$th column
of $B$, then the logarithms of the c.f.s of $s_k$ and $r_l$ differ
by a wide sense polynomial in a neighborhood of the
origin.

Proof:

(i) Without loss of generality we assume that the matrices $A$
and $B$ are scaled such that the first rows consist only of
zeros and ones. This amounts only to the scale of the r.v.s
$s_l$ and the r.v.s $r_l$. Furthermore, since the components of $\mathbf{x}$
can be interchanged if necessary, the first entry of the
$k$th column of $A$ can be taken to be one.

As always, there exists a neighborhood $\epsilon > 0$ of zero
such that all c.f.s are nonzero, and the logarithms of
c.f.s are well-defined. Therefore for $\mathbf{z} = (z_1, z_2)^T \in \mathbb{C}^2$,
$|z_1| < \epsilon$, $|z_2| < \epsilon$, we have using the properties of
c.f.s that

$$\begin{aligned}
\log \varphi_{\mathbf{x}}(\mathbf{z}) &= \log \varphi_{\mathbf{s}}(A^H \mathbf{z}) = \log \varphi_{\mathbf{r}}(B^H \mathbf{z}) \\
&= \sum_{l=1}^{m} \log \varphi_{s_l}(a_{1l}^* z_1 + a_{2l}^* z_2) && (46) \\
&= \sum_{l=1}^{n} \log \varphi_{r_l}(\beta_{1l}^* z_1 + \beta_{2l}^* z_2), && (47)
\end{aligned}$$

where $A = (a_{ql})$, $B = (\beta_{ql})$. Let $q$ be the number of
different noncollinear columns with nonzero coefficients
in $A$ and $B$ other than the $k$th column of $A$. Now
subtracting (47) from (46) and combining the terms with
equal nonzero coefficient arguments to functions $h_l$, and
with one zero coefficient to $f$ and $g$, respectively, we get
an equation of the form

$$\log \varphi_{s_k}(z_1 + a_{2k}^* z_2) + \sum_{l=1}^{q} h_l(z_1 + \gamma_l z_2) = f(z_1) + g(z_2) \tag{48}$$

if $a_{2k} \neq 0$, and of the form

$$\sum_{l=1}^{q} h_l(z_1 + \gamma_l z_2) = \log \varphi_{s_k}(z_1) + g(z_2) \tag{49}$$

if $a_{2k} = 0$. The numbers $a_{2k}^*, \gamma_1, \ldots, \gamma_q$ are now distinct,
and then by Lemma 5 $\log \varphi_{s_k}$ must be a wide sense
polynomial in $(z, z^*)$ of degree not exceeding $q$. Thus
by Theorem 9 the r.v. $s_k$ is complex normal.

(ii) By definitions of representations, the $k$th column of $A$ is
collinear only with the $l$th column of $B$. Thus one of the
$h_l$'s in the proof of part (i) is the difference of the logarithms
of the c.f.s of $s_k$ and $r_l$, and the claim follows from
Lemma 5.

■

Lemma 9: Suppose independent complex r.v.s $s_1$ and $s_2$
are independent of complex normal r.v.s $n_1$ and $n_2$. If $s_1 + n_1$
is independent of $s_2 + n_2$, then also $n_1$ and $n_2$ are independent.

Proof: Since the r.vc. $(s_1, s_2)^T$ is independent of the
r.vc. $(n_1, n_2)^T$, the joint c.f. can be written as

$$\begin{aligned}
\varphi_{s_1 + n_1, s_2 + n_2}(z_1, z_2)
&= \varphi_{s_1, s_2}(z_1, z_2)\, \varphi_{n_1, n_2}(z_1, z_2) \\
&= \varphi_{s_1}(z_1)\, \varphi_{s_2}(z_2)\, \varphi_{n_1, n_2}(z_1, z_2).
\end{aligned} \tag{50}$$

On the other hand, using the independence of $s_1 + n_1$ and
$s_2 + n_2$, we have

$$\begin{aligned}
\varphi_{s_1 + n_1, s_2 + n_2}(z_1, z_2) &= \varphi_{s_1 + n_1}(z_1)\, \varphi_{s_2 + n_2}(z_2) \\
&= \varphi_{s_1}(z_1)\, \varphi_{n_1}(z_1)\, \varphi_{s_2}(z_2)\, \varphi_{n_2}(z_2),
\end{aligned} \tag{51}$$

and therefore

$$\varphi_{s_1}(z_1)\, \varphi_{s_2}(z_2)\, \varphi_{n_1, n_2}(z_1, z_2)
= \varphi_{s_1}(z_1)\, \varphi_{s_2}(z_2)\, \varphi_{n_1}(z_1)\, \varphi_{n_2}(z_2). \tag{52}$$

Then, in some neighborhood of zero, all c.f.s in (52) are
nonzero, and we have

$$\varphi_{n_1, n_2}(z_1, z_2) = \varphi_{n_1}(z_1)\, \varphi_{n_2}(z_2) \tag{53}$$

in the neighborhood. By the $\alpha$-decomposition theorem [34,
Theorem 7.4.2], the equation is valid for all $z_1$ and $z_2$, i.e., $n_1$
and $n_2$ are independent. ■

Lemma 10: If complex r.v.s $n$ and $s$ are independent and
$n + s$ is independent of $n$, then $n$ is degenerate (i.e., a constant).

Proof: By Theorem 4 the r.v. $n$ is complex normal. As
in the proof of Lemma 9 it follows that the equation

$$\varphi_n(z_1 + z_2) = \varphi_n(z_1)\, \varphi_n(z_2) \tag{54}$$

is satisfied in a neighborhood of zero. This is only possible if
$n$ is a degenerate complex normal r.v., i.e., a complex normal
r.v. with zero variance. ■